StemRoller – Isolate vocals, drums, bass, and other stems from any song

391 points | 8 days | github.com
ksherlock8 days ago

Not to be dismissive, but as far as I can tell, the heavy work is done by Facebook's Demucs, and this is an Electron front end that runs the Demucs CLI (and, I guess, searches YouTube for videos to download). The Demucs project page has more information.

https://github.com/facebookresearch/demucs

lapink8 days ago

Original Demucs author here. Thanks for putting forward our research!

I’m definitely happy to see more front ends for Demucs being developed and to read that it has been useful to other musicians!

We are working on the next iteration of the model, and with more sources, hopefully released by the end of the year :)

If you are interested in this research you can follow my Twitter (@honualx) or star the Demucs repo.

game-of-throws8 days ago

I'm curious, what is the business justification for funding development of Demucs, if you don't mind me asking? It doesn't seem very related to FB's core business.

FLT88 days ago

Solving problems like audio source separation (e.g. distinguishing multiple speakers in a noisy environment, or picking speech out of a background where music is playing) seems very much in FB's wheelhouse.

lapink8 days ago

The goal of Meta AI Research is to do open research, not necessarily with direct applications at the time we start it. Indeed, the architecture, or the lessons learnt working on it, can become useful later for the company, for instance for remote presence with VR, to isolate the main speaker from their environment ( https://arxiv.org/pdf/2206.15423.pdf ).

nickserv7 days ago

Just a guess here, but I wouldn't be surprised if it's used to better spy on your messenger audio conversations. They already listen in and will pick up keywords to populate your FB ad stream.

ec1096857 days ago
dekhn6 days ago

Hi, I just downloaded demucs yesterday and started using it. It's amazing! I really appreciate all the work you put into making it easy to install and understand.

Is there any chance you can disentangle guitar and keyboard? I work a lot with Grateful Dead music and I'd like to be able to pull Jerry's guitar out from the keyboard in live shows. Similarly, it would be cool if you could parse Shpongle into its constituent tracks, but I think that's probably impossible.

braindead_in8 days ago

Is there something similar for separating different voices from spoken audio?

lapink8 days ago

Yes there are, you can have a look at https://github.com/etzinis/sudo_rm_rf for instance for 2 speakers separation. There is also this one for 3 speakers: https://huggingface.co/speechbrain/sepformer-whamr

pininja8 days ago

There’s no need to be dismissive since they say this in the first sentence. Preparing an easy to use app for all platforms probably does get this into more creative hands, and that’s a net-positive contribution I can appreciate.

SebaSeba8 days ago

They did not prepare it for all platforms though. Linux is missing.

pininja8 days ago
amelius8 days ago

It should be in the title.

TimTheTinker8 days ago

It does seem rather disingenuous that the product page makes no mention that the author didn't do the heavy lifting, and that at the same time it features a prominent donation button.

If I didn't know who did the real work and benefitted a lot from this tool, I'd give to StemRoller in proportion to my gratitude -- which I'm sure others are liable to do.

frob8 days ago

It's in the first paragraph of the README in the GitHub repo and the second paragraph on the website. I'm not sure what more can be asked of the author.

TimTheTinker8 days ago
thisiswater8 days ago

Tried splitting a complex arrangement (Chicago by Sufjan Stevens). Drums, bass, and vocals come out fairly well, though the drums stem seems to lack percussion elements outside of the core rock drumkit (e.g. tambourine), and cymbal hits are clipped rather than ringing. The 'other' stem, the rest of the instrumentation, keeps a fair bit of the percussion and there's bleed from the vocal melody.

The backing vocals seem to have disappeared for the most part, and are only audible in the vocals stem when the lead vocal is present (like they're reverse-ducked? Been a while since I did any production, the terms have escaped me...).

Not much use with complex arrangements to be honest, I was hoping to get things like the strings section separated from the rest of the arrangement.

Original: https://www.youtube.com/watch?v=tWX3El-slpY

Output: https://file.io/etpOQt57ziKe

pcf8 days ago

Did you use a FLAC/WAV file? That should yield the best results.

(Only asking because you linked to YouTube, and I'm not sure if you used the YouTube audio for your source.)

thisiswater8 days ago

Perhaps you're right, I'd have to check.

I typed the song in the search and pressed the first likely result, which is the youtube video I linked. Using the software as intended I believe.

marssaxman5 days ago

YouTube audio is optimized for bit rate, not quality (128K MP3). You will get better results with a higher-bitrate MP3 (320K would be good), better still with an uncompressed format like FLAC or WAV.

BitPirate8 days ago

Makes sense. MP3 tries to compress without losing information in the spectrum audible to humans, but that information can still be processed by algorithms.

NonNefarious8 days ago

I can't find any way to do this.

djcannabiz8 days ago

I tried throwing some underground rap artists at this app, as stem splitters usually struggle with them

I split https://www.youtube.com/watch?v=DDaL7KBjkDI

And it gave me this https://www.dropbox.com/sh/inyk38n2jrp5i45/AACpB0xXNFxamEmP3... I noticed some weird hissing with the 808s, but other than that it sounded pretty good.

For more of a challenge, I inputted https://www.youtube.com/watch?v=uAwQ3njiU4M

and it came up with https://www.dropbox.com/sh/97lzke0puh9dzeo/AACE75vsbNS43UqqH... It was able to separate some of the kicks from the 808s, which is really impressive to me!

Overall, I'm very impressed! This sounds much better than lalal.ai to me.

polishdude208 days ago

I'd like to take a moment to mention how great dropbox's audio seeking thing is. It's super fast and works as intended. Great work whoever implemented this.

pelagic_sky8 days ago

I’ve found Lalal to be my go-to. If this is better, then I’m very interested in trying it out.

pelagic_sky8 days ago

Just a follow-up. In my two conversions so far, Lalal.ai has been better, especially at separating drums from instruments. I'll give StemRoller a few more tries as I am always looking for options.

pelagic_sky6 days ago

Update number three. I now just use both lalal and stemroller because each one seems to do better in certain cases. If I hadn’t paid for lalal, I’d probably just use stemroller as it’s way better than RX9

djcannabiz8 days ago

What genre of music, may I ask?

pelagic_sky6 days ago

RNB, NeoSoul, Trap

metadat8 days ago

Why do vocals.wav, other.wav, and instrumental.wav all start out the exact same (with vocal sounds)?

squeaky-clean8 days ago

Super impressive splitting there, wow. Just curious, was your source a lossless or compressed file?

djcannabiz8 days ago

The second file was lossless, the first was ripped from a CD.

elaus8 days ago

This seems to run just fine under Linux as well, though not completely out of the box: it's basically missing the builds and config for Linux, which can be built analogously to the existing Win/Mac stuff.

You also have to build the demucs-cxfreeze dependency (as described in its repo, https://github.com/stemrollerapp/demucs-cxfreeze).

elaus8 days ago

It's almost eerie how well this works with electronic music. Coming from an age where your best try to separate a track was using equalizers, I didn't have high hopes.

Trying it out with Alan Walker's Alone, it separates the vocals and drums almost perfectly. Bass is really fine as well, only instrumental and 'other' was a bit mixed up in my try.

knicholes8 days ago

Whenever I see an "##Installation" section with more than one step, I immediately call DOCKER!

dylan6048 days ago

"Download and extract the latest ffmpeg snapshot from evermeet.cx and place the ffmpeg executable inside"

Why? Why can't this just point to the location where ffmpeg is rather than making a copy of ffmpeg? symlink might work, but just do a $(which ffmpeg) or ask the user for the path ~/bin/ffmpeg /usr/local/bin/ffmpeg etc
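
A minimal sketch of that lookup, written in Python rather than the app's own JavaScript: try a path supplied by the user first, then fall back to whatever `ffmpeg` is on `PATH` (the equivalent of `which ffmpeg`). The function name `find_ffmpeg` and its parameters are hypothetical and not part of StemRoller.

```python
import os
import shutil
from typing import Optional


def find_ffmpeg(user_path: Optional[str] = None, name: str = "ffmpeg") -> Optional[str]:
    """Return a path to an ffmpeg executable, or None if none is found.

    Hypothetical helper: an explicit user-supplied path wins,
    otherwise PATH is searched.
    """
    if user_path:
        expanded = os.path.expanduser(user_path)  # allow ~/bin/ffmpeg
        # Accept the explicit path only if it is an executable file.
        if os.path.isfile(expanded) and os.access(expanded, os.X_OK):
            return expanded
    # shutil.which returns None when the name is not on PATH.
    return shutil.which(name)
```

Calling `find_ffmpeg("~/bin/ffmpeg")` would return the expanded path if that file exists and is executable, and otherwise whatever `PATH` provides. It does not check which ffmpeg version was found.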

PaulDavisThe1st8 days ago

ffmpeg has not had a stable command line interface for some time. It can be a problem to assume that the system-installed version accepts the arguments you plan to give it.

Rodeoclash8 days ago

It's even easier than that. There are a few npm libs around that are dedicated to shipping a copy of ffmpeg with Electron.

dylan6048 days ago

even easier than what i already have on my system? what are you saying here, as it makes no sense to me

linux26478 days ago

Maybe there’s some feature of bleeding edge ffmpeg that’s required for the app

setgree8 days ago

Open Culture recently posted a link to Abbey Road but with only Paul's bass lines, but the actual content got taken down. [0] It was really cool though, in part because it's not nearly as precise as I would have thought, which made it feel really organic.

[0] https://www.openculture.com/2022/04/hear-the-beatles-abbey-r...

TylerE8 days ago

In the real world where tracks are cut live, there is a fair bit of microphone bleed

salmo7 days ago

I imagine studio-era Beatles in particular would be difficult.

Microphone bleed, lots of overdubs (especially vocals), and repeated re-layering tracks on tape over and over due to channel limitations. They really were doing crazy stuff with limited tech.

I think this would be hard for bands that really fill the spectrum and don’t have that clean treble, mid, bass separation. Or recordings really compressed into a frequency range.

Now this makes me want to see what happens with like My Bloody Valentine and Husker Du :).

hammock8 days ago

Especially in the day and style that the Beatles recorded. Today, not so much

tiagod8 days ago
phonescreen_man8 days ago

Been using demucs for a couple of weeks now, mostly taking my early produced music, which I have since lost the project files for, and giving it a remix and update. Gotta say I have been blown away by how good demucs is. I installed it following the repo instructions and then created a zsh alias to run it with any file name. E.g. $ ai_split mySong.mp3

Wait fifteen minutes and out pop four stems, flawless so far. I've even been messing around with mainstream tracks and using Ableton with warp applied to quickly build out remixes. Demucs is going to be (or already is) a game changer!
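
For anyone who wants the same one-command workflow, here is a rough Python equivalent of such an alias. It assumes only that the `demucs` command from the Demucs repo is installed and on `PATH`. The function names are made up for illustration, and the actual alias used above is not shown in the comment.

```python
import subprocess
from typing import List, Optional


def build_demucs_command(track: str, out_dir: str = "separated",
                         two_stems: Optional[str] = None) -> List[str]:
    """Build the argument list for the Demucs CLI (hypothetical helper)."""
    cmd = ["demucs", "-o", out_dir]
    if two_stems:
        # e.g. two_stems="vocals" asks for vocals plus everything else.
        cmd.append(f"--two-stems={two_stems}")
    cmd.append(track)
    return cmd


def split(track: str) -> int:
    """Run Demucs on one file and return its exit code."""
    return subprocess.run(build_demucs_command(track)).returncode
```

`split("mySong.mp3")` runs `demucs -o separated mySong.mp3`. Demucs then writes the stems into a subfolder of `separated` named after the model.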

eyelidlessness8 days ago

This testimonial almost has me wanting to try it on an “album”[1] I recorded when I was in a “band”[2] in high school. I too lost all of the source files[3].

1: On second thought maybe not. It has not aged well.

2: Me and another kid, with a guitar, a pre-OS X Mac, a pirated copy of Rebirth, a pirated copy of SoundEdit 16, and literally the mic that Apple used to include with (some?) Macs. I’d back-reference[1], but our equipment was not the problem. Well, except for [3].

3: I learned my lesson: I should have been older and had a job that would afford me a backup drive, so I could sample the sounds of that dying HDD and retcon the samples into my “album”[1].

pininja8 days ago

That’s awesome! I wonder if there are projects to create a repository of pre-split public domain music? Seems like something the internet archive could host once created.

phoe-krk8 days ago

Are there any public examples of the split audio files?

Cerium8 days ago
gaudat7 days ago

That playlist cover can definitely pass as an album art.

Dwedit8 days ago

Let's see how long it takes for some new Neil Cicierega remixes to appear now.

intvocoder8 days ago

With a tool like this, you could get back into the animutation scene. (Edit: I guess it's a bit of a non-sequitur, but I enjoyed Suzukisan, so there's that.)

chriscjcj8 days ago

Is there a way to process my own audio file rather than choosing one from YouTube?

NonNefarious8 days ago

A couple of commenters have mentioned using lossless files, but so far no one has said HOW.

nextaccountic8 days ago
NonNefarious7 days ago

Thanks. The comments seemed specific to this front-end, but maybe.

nr2x8 days ago

How is this similar/different than the Deezer one?

ksherlock8 days ago

I just did a quick test of demucs vs spleeter:4stems. demucs is significantly slower but the output is better.

in a semi blind comparison, I prefer demucs for all 4 tracks (drum, bass, vocals, and other). bass and other stand out the most so let me say a couple words about them.

bass - the demucs bass has less bleed from other instruments and the volume is consistent throughout. with spleeter, the volume varies a lot and there are multiple sections of 1-2 bars where it just drops out completely. In Capo, the demucs spectrogram is nice and clear whereas spleeter tends to look like pencil smudges for the most part.

other - with spleeter, whenever there are vocals, the other instruments turn to mush. demucs is much better. Oh, you can tell people are singing -- the instruments get muffled -- but you can still hear them.

anigbrowl8 days ago

It's pretty decent. I threw a drum'n'bass track at it to see how it would cope with heavily produced material and the results were surprisingly good.

CharlesW8 days ago

I'd also be interested in how it compares to iZotope RX's Music Rebalance (examples from earlier releases here: https://www.izotope.com/en/learn/stem-isolation-music-rebala...).

avis8 days ago

I'd be interested to know how it compares to iZotope as well as phonicmind.

pcf8 days ago

I just checked "Californication" (used for all their other examples here: https://soundcloud.com/honualx/sets/source-separation-in-the...) in RX9 Music Rebalance with the setting to "best", and I wasn't very impressed.

Seems like this tool might be better than iZotope's.

eshack948 days ago

I dabble in audio production in my free time outside of work, and I typically will use iZotope RX 9 or Neural Mix Pro for isolating vocals or stems. However, these are paid products, and it's encouraging to see more open source projects being built around this space.

I like the opportunity to view the source code and learn from it, as opposed to most paid products which are typically closed-source and a bit of a "black box".

Sure - this is mostly just an accessible frontend for Demucs, but that's still okay. The author clearly indicates that in his repo, giving credit where credit is due. Additionally, this helps less-technical creators be creative in new ways.

Thanks to all who contributed.

yarg8 days ago

Honestly, this sort of thing is cool; but why (in general) is it necessary in the first place?

If the elements of the song are recorded in isolation, which they are in all studio versions, why can't we just move to a format that supports the layering?

gavinray8 days ago

Musicians and studios don't generally tend to offer the public access to original stems for songs (why would they?)

Say that you want to make a remix, mashup, or otherwise use sound bites from a song. The easiest thing to do is use a tool like Spleeter/Demucs to separate the source layers so that you can then further process them in your DAW.

This is what I do, but I just use the Demucs CLI because it's simple enough.

https://github.com/facebookresearch/demucs

pabs38 days ago

Are there no communities of "open source" music? It sounds like the stems are part of the "source code" for tracks.

jononor8 days ago

Many niches in electronic music have small, tight-knit communities of creators and producers who regularly remix each other's stuff. But it is not an open community: you have to have decent standing (from making your own music or prior remixes) before someone is willing to send you their stems. For any musician who has a label or publisher, they also need to be in the loop to handle the royalties. So sharing stems happens regularly in the music industry, but it is not easily accessible, which makes tools like the one mentioned very useful for everyone else who would like to participate.

osigurdson8 days ago

It isn't really in the best interest of the artist to provide this. The final mix is part of the overall product / work of art. Providing all of the individual tracks (there could be 30 or more in total) would also take up a lot of space / increase processing requirements while benefiting very few.

spyrefused8 days ago

I usually use these kinds of tools to get the bass score of some songs, for example. With the isolated elements it is much easier to know exactly which notes are sounding (I don't have a good ear). The same goes for drums or synth notes.

Since sound quality doesn't matter much to me for this, I usually use iZotope RX, but I will try this tool.

amelius8 days ago

This is like asking why we need decompilers.

yarg7 days ago

> (In general)

Yes, I agree.

atoav8 days ago

For all who are looking for something like this, iZotope RX (the audio retouching software) has a function called "Music Rebalance", which is great for reducing spill or changing the balance in a live recording.

ccn0p8 days ago

talk about a missed opportunity without examples. did I miss them somewhere?

nerfhammer8 days ago

I've always wanted a way to extract just the kick drums in realtime but I don't understand this field well enough to understand whether it would be remotely possible or not.

jononor8 days ago

You want just the beat, ie the time markers of each kick? Or you want the isolated sound (ie audio) of each kick? Both are generally possible today, though the approach will differ a little bit.
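
To make the first option concrete, here is a toy sketch of finding time markers: low-pass the signal, measure energy per frame, and report each frame where the energy jumps above a threshold. This is an assumption-laden illustration in pure Python. It is not how Demucs works, it is not tuned for real music, and the name `kick_times` and all its parameters are made up.

```python
import math
import statistics
from typing import List


def kick_times(samples: List[float], sample_rate: int,
               cutoff_hz: float = 150.0, frame: int = 512,
               ratio: float = 4.0, min_energy: float = 1e-6) -> List[float]:
    """Return approximate start times (seconds) of low-frequency bursts."""
    # One-pole low-pass filter: keeps the low end, where kick drums live.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low: List[float] = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        low.append(y)

    # Energy of the filtered signal in consecutive, non-overlapping frames.
    energies = [sum(v * v for v in low[i:i + frame])
                for i in range(0, len(low) - frame + 1, frame)]
    if not energies:
        return []

    # A frame counts as "loud" if it is well above the typical (median) frame.
    threshold = max(ratio * statistics.median(energies), min_energy)

    times: List[float] = []
    previous_loud = False
    for index, energy in enumerate(energies):
        loud = energy > threshold
        if loud and not previous_loud:  # rising edge marks a new hit
            times.append(index * frame / sample_rate)
        previous_loud = loud
    return times
```

The reported times are only accurate to one frame, and a bass line in the same frequency range would confuse it. Getting the isolated audio of each kick is the harder problem and is where a separation model comes in.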

screech8 days ago

Just wow! There were already methods for extracting a cappellas from tracks, but this tool is on another level. Fascinating how good the results are.

polishdude208 days ago

This is awesome! Tried it out on Rush's Tom Sawyer and it splits out the vocals great! I can see this being super useful!

abbusfoflouotne8 days ago

Would appreciate an easier way to download and run this! The steps on the readme are pretty long, at least for me (Mac user)

interestica8 days ago

How does it compare to lalal.ai ?

threefour8 days ago

It's free.

amelius8 days ago

And otherwise identical?

kbob6 days ago

Demucs did a much better job of isolating the bass on a blues track than LALAL. The bass actually sounded like a bass. LALAL got the note pitches but lost their attacks.

colecut7 days ago

Anyone else just getting 'failed' on every song they try?

NonNefarious8 days ago

How do you load a local file?

diimdeep8 days ago

There is no support for such a thing. This is software in the year 2022: never local, online first.

NonNefarious7 days ago

Hahah, I know, right? People actually believe that shit... until they get jacked by a service provider.

volkse8 days ago

Is there a VST front end?

anderfernandes18 days ago

Wow

raydiatian8 days ago

How does it perform compared to Deezer Spleeter or lalal.ai?

Else who cares