A data hoarder claims he's downloaded all 900TB of SoundCloud
SoundCloud fans really don't want to see the streaming service go.
One user, posting on Reddit's forum for "data hoarders," claimed he's downloaded the bulk of SoundCloud's public archive, and that it's "only" 900TB.
He posted in response to reports that SoundCloud only has funding to last it 50 days, and that users should back up their favourite tracks. SoundCloud has denied those reports, saying it's "here to stay."
Posting under the name "makemakemake," the Reddit user claimed he ripped SoundCloud's files over a connection of 80Gb/s, which would have taken more than a day.
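The "more than a day" figure can be checked with simple arithmetic. The sketch below assumes decimal units (1TB = 10^12 bytes), a link that sustains its full 80Gb/s for the whole transfer, and no protocol overhead; the function name is illustrative only, and none of these assumptions comes from the Reddit post.

```python
def transfer_time_hours(size_tb: float, link_gbps: float) -> float:
    """Hours needed to move size_tb terabytes over a link of link_gbps gigabits per second.

    Assumes decimal units and a link that runs at full speed the whole time.
    """
    size_bits = size_tb * 1e12 * 8       # terabytes -> bytes -> bits
    link_bps = link_gbps * 1e9           # gigabits per second -> bits per second
    return size_bits / link_bps / 3600   # seconds -> hours


hours = transfer_time_hours(900, 80)
print(f"{hours:.1f} hours")  # prints "25.0 hours"
```

Under those assumptions the download takes about 25 hours, which is consistent with the claim. Any real transfer would take longer once overhead and throttling are counted.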
He posted a temporary 100GB log file as proof, though because of the file's size, Business Insider hasn't verified it.
Makemakemake didn't give precise technical details, but did reference using Google's cloud computing services, rather than a home connection. SoundCloud, meanwhile, is hosted on Amazon's cloud computing platform.
Makemakemake doesn't plan to make a replica or "mirror" version of SoundCloud available to the public. He wrote: "It'd be quicker to download everything from [SoundCloud] yourself. Only have 10G to where the data is now."
It isn't clear why SoundCloud hasn't detected and blocked such activity through rate limiting. The company hasn't responded to a request for comment.
The enterprising Redditor isn't alone in trying to archive SoundCloud.
The Verge reported that ArchiveTeam, which tries to preserve sites that are about to shut down, will undertake a large-scale backup of SoundCloud this week.
ArchiveTeam backs up at-risk services with the help of volunteers running its virtual archiving appliance, ArchiveTeam Warrior. Volunteers give up some bandwidth and disk space to help scrape those sites. Defunct sites preserved by the project include GeoCities, TwitPic, and Google Video.
The project only backs up sites it thinks are at risk of shutting down. SoundCloud this month cut 40% of its staff, and earlier this year it warned it might run out of cash. CEO Alex Ljung has batted away reports that the company is about to shut down.
But ArchiveTeam cited the layoffs, cashflow issues, and an earlier legal case involving unpaid royalties as good reasons to preserve the site.
So far, it said, "rate limiting has not had an effect" on its efforts to grab SoundCloud's API data.
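For readers unfamiliar with the term, rate limiting caps how many requests a client can make in a given period. The sketch below shows one common approach, a token bucket. It is a generic illustration and not a description of how SoundCloud's API works; the class name, parameters, and injectable clock are all hypothetical.

```python
import time


class TokenBucket:
    """Generic token-bucket limiter: each request spends one token,
    and tokens refill at a steady rate up to a fixed capacity."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum tokens the bucket can hold
        self.clock = clock            # injectable time source (hypothetical, eases testing)
        self.tokens = capacity        # start with a full bucket
        self.last = clock()           # time of the most recent refill

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: allow short bursts of 2 requests, refilling 1 token per second.
bucket = TokenBucket(rate=1, capacity=2)
results = [bucket.allow() for _ in range(3)]
```

A scraper that spreads its requests across many addresses, or stays under the limit on each one, would pass a limiter like this, which is one possible reason such limits might have little effect on a large archiving effort.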