
De Duping?

Discuss CD+G's, VCD's, song book creators, and any other karaoke related software.
Unlimited MP3+G Downloads
Billy Bob Joe Bob 1958
Posts: 29
Joined: Tue Feb 23, 2010 11:54 pm
Location: Houston

De Duping?

Post by Billy Bob Joe Bob 1958 »

Thanks to converting/loading disk after disk after disk (after disk) onto a crusty old laptop hard drive, I now have three, four or five versions of too many songs.

It not only slows the search process, but it also makes it take longer to scroll through them all. And the hard drive is getting full.

Does anyone know of any SAFE and EFFECTIVE deduping programs?

Has anyone ever deduped their hard drive?

(I don't like running at 92% full...)


Posts: 2937
Joined: Wed Jan 31, 2007 2:15 am

Post by Bigdog »

Dupes...same song by different companies or same song same company?

If you mean dupes as the same song by different companies, I have many people asking for a song by a different company than I have listed. (I only list one version of each song, the one I believe sounds the most like the one on the radio.) People who sing at other karaoke shows are used to singing songs from companies I don't use as my first musical choice, because those shows don't have the versions I list.

Proving I know nothing about karaoke. :shock: :lol:

It comes down to how happy you want your customers to be at your show. Even if I think the version I have listed sounds better I play the one they want.

My book doesn't list the dupes but my hard drive has them.

Hard drives aren't that expensive. If you are having slow searches, you have either too much crap on the computer/hard drive or a slow computer.

Run a disk cleanup and a defrag. My karaoke computers are strictly for karaoke. They only have essential karaoke running stuff on them. They never see the internet. Karaoke shows only.

I have 3 desktops (2 karaoke-only business stuff) (one internet/personal)

2 karaoke laptops (Karaoke only)

2 personal laptops (One karaoke business only) (one internet/personal)

If I want to keep my karaoke computing safe...NEVER internet.
Billy Bob Joe Bob 1958
Posts: 29
Joined: Tue Feb 23, 2010 11:54 pm
Location: Houston

Post by Billy Bob Joe Bob 1958 »

Dupes...same exact songs by same exact publisher. Same exact file name.

I don't mind having three, four or five different versions of a song. Sun Fly. SC. Chart Busters. Legends. They all have their plusses and minuses.

But I hate DUPES!!!!!! I hate having four Sound Choice anything!

As I'm getting deeper into this, I'm seeing that Sound Choice and a few others have the irritating habit of changing one or two digits/letters in the file name so that the exact same file appears to be two completely different songs. (They're not. Other than the filename, they're the same size and the same file--identical.) What a pain!
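For files that really are byte-for-byte identical under different names, a name-based search isn't needed at all: comparing contents will catch them. Here is a minimal Python sketch of that idea, not any particular product. The function names are made up for illustration, and it only reports groups; it never deletes anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_files(root):
    """Group files under root whose contents are byte-for-byte identical."""
    # Cheap first pass: only files of equal size can be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:
            for path in paths:
                by_hash[file_digest(path)].append(path)

    # Keep only groups with more than one member, sorted for stable output.
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling find_identical_files on the song folder returns a list of groups, each group being the paths that hold the same bytes. Which copy to keep is still a manual decision, and a backup before deleting anything is a good idea.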

In any case, I've seen a great deal about the deduping software used in virtual server and knowledge management environments--but that costs $20-50 thousand!!!

What I'm curious about is whether anyone has run any of the many cheap ($20-$50) deduping programs that pop up on a search. (I vastly prefer learning from the experiences of others rather than risking a major mistake myself.)

Defrag is a good idea. But what I'm really looking to do is execute some major cleanup!

As for your laptops never seeing the Internet, don't any of your venues have wi-fi? I LOVE being hooked in while doing a show.

For example, if someone is having trouble "remembering" a song, I ask 'em to sing or say the lyrics to me, I type them in, and voila, we have a song title! (People LOVE that!)

I also like to use YouTube to set the mood before the show. I'll create a playlist of 10 or 15 songs or so and run them on the monitors. It gets people psyched and can be used to set a theme. Like last week Aerosmith was in town, so the warmup had Aerosmith videos galore (and we called it "Aerosmith night", asking each of our regular singers to try at least one Aerotune).

I also love the internet because if someone asks for something I don't have--I can buy it quickly from Tricerasoft or somewhere similar. (People love that too!)

Oh well...

So from what I'm hearing, you don't have any dupes. How do you keep it so clean? (Are you transferring only one song at a time? Including dupes, I have 6,000 songs now. But it would have taken me years to get there one at a time!)

Posts: 1498
Joined: Sun Jan 01, 2006 8:37 pm
Location: USA

Post by DanG2006 »

Bigdog is saying what I am about to say. I have 9012 songs right now. I am waiting on another delivery of a disc. But out of those 9012 songs, at least 300 are duplicates. Like Bigdog, I only print one version of the song unless it is redone by another artist or is a totally different version, such as a live version versus a studio version. I have no problem having more than one version of a song on my hard drive, because what I might consider the "best" version might not be considered the "best" by, say, Bigdog.
Oh and as to downloads, I only download at home so if they want the song they can give me the $$ to pay for it and I will download it and have it for them by the next show.
Posts: 2937
Joined: Wed Jan 31, 2007 2:15 am

Post by Bigdog »

Billy Bob Joe Bob 1958 wrote:Dupes...same exact songs by same exact publisher. Same exact file name.


You just told us every reason your computer is running slow. It's bogged down with too much internet crap.

When your computer gets killed by a virus and your show stops dead and you make no money and it costs you money to get it fixed maybe you will understand.

You have too many dupes because of the same thing I did. Trying to make everyone instantly happy costs money.

I have over 22,000 song files on my hard drive. Searching speed is not a problem. Of the 22,000 songs only 12,000 are one of a kind. The rest are dupes. That means I wasted about 10,000 songs' worth of money, or about $20,000. :shock:

No customer is worth an instant download and all the problems that could come with it by way of a virus. Bar owners are lucky to be open, let alone have wi-fi.

I have wasted money buying special request songs. They never get sung. Or maybe sung once.

As long as you continue to use the karaoke computer on the internet you will have a slow computer. You're lucky it hasn't died yet. I just had a virus kill my home laptop. It needs to have the hard drive replaced. It cannot be cleaned with anything. If that was my karaoke show computer I'd be screwed. It's why I have a backup karaoke laptop. And why they never go on the internet. Dupes are the least of your worries. A good virus will delete them for you. :shock:
Posts: 674
Joined: Tue Apr 28, 2009 5:41 am
Location: Dundee, Scotland

Post by mnementh »

Billy Bob Joe Bob 1958 wrote:Dupes...same exact songs by same exact publisher. Same exact file name.
If what you say above is correct, then it's very easy to get rid of duplicates.

Simply Google for Freeware duplicate finder and you should be able to get any number of pieces of software that will do the job.

However, are you SURE the file names are IDENTICAL?????

If the names differ, even by the slightest amount, then no dupe finder I have been able to find will flag them as dupes.

For example;

SF001-01 - Maggie Mae - Stewart, Rod.ZIP
SF001-01 - Maggie mae - Stewart, Rod.ZIP
SF001-01 - Maggie Mae - Stewart Rod.ZIP
SF001-01 _ Maggie Mae _ Stewart, Rod.ZIP

No duplicate finder will match these files to each other, as there is a slight difference in each name.

Before you run your dupe finder, you MUST ensure that ALL your filenames are consistent.
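One way to picture "consistent" is a comparison key that ignores the kinds of differences in the four example names: letter case, commas, and underscores versus hyphens. The Python sketch below is only an illustration of that idea; name_key and group_by_name_key are made-up names, and it compares names without renaming or deleting any file.

```python
import re
from collections import defaultdict


def name_key(filename):
    """Reduce a filename to a comparison key that ignores case and punctuation."""
    # Lower-case, then turn every run of non-alphanumeric characters
    # (spaces, commas, underscores, hyphens, dots) into a single space.
    return re.sub(r"[^a-z0-9]+", " ", filename.lower()).strip()


def group_by_name_key(filenames):
    """Return lists of filenames that share the same normalised key."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name_key(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]


names = [
    "SF001-01 - Maggie Mae - Stewart, Rod.ZIP",
    "SF001-01 - Maggie mae - Stewart, Rod.ZIP",
    "SF001-01 - Maggie Mae - Stewart Rod.ZIP",
    "SF001-01 _ Maggie Mae _ Stewart, Rod.ZIP",
]
print(name_key(names[0]))                # sf001 01 maggie mae stewart rod zip
print(len(group_by_name_key(names)[0]))  # 4
```

All four names collapse to the same key, so they land in one group. The key is deliberately crude: it drops accented letters along with the punctuation, and it would not catch an artist written as "Rod Stewart" in one name and "Stewart, Rod" in another.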

Also, as Bigdog has said, you are basically crippling your laptop by running network stuff on it, alongside the Karaoke.

My Karaoke laptop does NOT, under any circumstances get connected to the web.

It's a Karaoke machine, nothing else.

Site Admin
Posts: 1906
Joined: Wed Aug 18, 2004 5:05 pm
Location: WV

Post by wiseguy »

The unfortunate fact is that there is no software that will do what you need. Thanks to no standard naming convention between the song manufacturers there probably never will be. I know it's a pain but accurately sorting out dupes is something you have to do manually.

I have no fear of having my karaoke laptop connected to the internet. An internet savvy person knows how to avoid malware and viruses. Have a firewall and a good and updated anti-virus in place (I use Vipre with Counterspy), only access trusted sites, and there will never be a problem.
Posts: 2937
Joined: Wed Jan 31, 2007 2:15 am

Post by Bigdog »

There is really no good reason to be on the internet during the karaoke show.

No customer is worth an instant download. Especially if they are watching over your shoulder.

I edit my song selection for undesirable content. I can do that in my house in private. At a show you can't do that. I may not want to "find" a certain song. It has nothing personal to do with the singer that requests it. It has to do with the content.

The slightest chance of a virus ruining a show is enough reason for me to keep away from it. Fewer programs running in the background means less trouble. I'm having trouble now with only Vista and Sax & Dottys on my computer.

I do all my song books and ripping on the house computer and transfer to the karaoke computers.
Posts: 674
Joined: Tue Apr 28, 2009 5:41 am
Location: Dundee, Scotland

Post by mnementh »

wiseguy wrote:The unfortunate fact is that there is no software that will do what you need. Thanks to no standard naming convention between the song manufacturers there probably never will be. I know it's a pain but accurately sorting out dupes is something you have to do manually.
To a certain extent, I do agree with the comments above.

However, I see no reason that the job can't be made a bit easier with suitable software, as in my post, Re. renaming files automatically.

My "Holy Grail" is to find a partial filename search duplicate finder and while I haven't found one yet, I live in hope.

Here, however, is one that comes pretty damn close;

This dupe finder is freeware and it has a user definable wildcard mask.

Why this is useful is because, while filenames can have differing general formats, MOST have the correct disc identification format, e.g.



For example, using the user defined mask, you can set up a search for;


This will find all names starting with SF plus 6 other characters while totally ignoring the rest of the filename and other attributes of the file.

You can then tag the files you don't want and delete them.

As in my example in the earlier post,

SF001-01 - Maggie Mae - Stewart, Rod.ZIP
SF001-01 - Maggie mae - Stewart, Rod.ZIP
SF001-01 - Maggie Mae - Stewart Rod.ZIP
SF001-01 _ Maggie Mae _ Stewart, Rod.ZIP

Using the wildcard search above WILL flag these files as dupes, because the search doesn't care about anything after the -01.

Using defined letters like SF will restrict your search to a specific manufacturer's names.

Important note:
Make sure you start with the most specific name you can use, e.g. search for SFMW??????.* first, then SFG??????.* then SF??????

This will minimise chances of mistakes.

Simply using the question mark symbol (?) for the number of characters you want to search for will enable wide ranging searches to be made.
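For anyone who wants to see the mask idea outside of a dupe finder, here is a rough Python sketch using the standard fnmatch module. It is not the freeware tool's own matching code. The masks are the ones from this post written without the trailing ".*" (the sketch adds its own "*" so the rest of the name is ignored), and disc_key and candidates_by_disc_id are made-up names.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Masks from the post, most specific first. "?" stands for exactly one character.
MASKS = ["SFMW??????", "SFG??????", "SF??????"]


def disc_key(filename, masks=MASKS):
    """Return the leading characters covered by the first mask that matches."""
    for mask in masks:
        # The trailing "*" makes the rest of the filename irrelevant.
        if fnmatchcase(filename, mask + "*"):
            return filename[:len(mask)]
    return None


def candidates_by_disc_id(filenames, masks=MASKS):
    """Group filenames whose mask-covered prefix is identical."""
    groups = defaultdict(list)
    for name in filenames:
        key = disc_key(name, masks)
        if key is not None:
            groups[key].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}
```

Fed the four Maggie Mae names, candidates_by_disc_id returns one group under the key SF001-01. Trying the masks in most-specific-first order matters here: a name starting with SFMW would also match the plain SF mask, but with a key cut off too early. The result is only a list of candidates, so the manual check before deleting still applies.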

Yes, there is still a manual selection process but at least it will save you trawling through 1000's of files that aren't dupes.

P.S. As before, apply common sense: make a backup first and you won't lose any files.
Do NOT come crying to me if common sense isn't applied. That's YOUR job!
Posts: 2937
Joined: Wed Jan 31, 2007 2:15 am

Post by Bigdog »

There are only 2 companies I can name now that are duping songs on every other new disc released.

Sound Choice and Chartbuster :shock:

99% of my dupes are because different companies made the same songs.
Posts: 7
Joined: Fri Aug 20, 2010 2:50 pm
Location: Middle TN

Post by berryoke »

Man, all of my venues are online, and I've never had an issue with viruses or slow computing. I am an extremely savvy PC user though. Never had a virus on my karaoke or home PCs. You just need to take the proper precautions and learn how to use the PC/internet. Don't be an idiot who just points and clicks on everything.

The internet is an awesome tool that, when used properly, can completely change your show. I stream my shows live online, audio and video, which people love and use. If I don't have a song I can get it from Tricerasoft and it makes me look like a hero, and there are so many other uses...
Posts: 2937
Joined: Wed Jan 31, 2007 2:15 am

Post by Bigdog »

berryoke wrote:man all of my venues are online, I've never had an issue with virus or slow computing. I am an extremely savvy pc user though. Never had a virus on my karaoke or home pcs. You just need to take the proper precautions and learn how to use the pc/internet. Don't be an idiot who just points and clicks on everything.

The internet is an awesome tool that when used properly can completely change you show. I stream my shows live online audio and video which people love and use, if I don't have a song I can get it from tricerasoft and it makes me look like a hero, and there are so many other uses...
Sound Choice lawyers are looking for heroes like you. Streaming your show is playing with fire. You don't even have the proper copyright permission to play your karaoke music out in public for a profit, let alone stream it live on the net. You are in copyright violation many times. :shock:

That's all I'm gonna say about it....beware. :roll:
Posts: 674
Joined: Tue Apr 28, 2009 5:41 am
Location: Dundee, Scotland

Post by mnementh »

Back on the subject of "de-duping", I've been working on this for a while now and in the same manner as automatically naming files from a text file, I believe there has to be a way to do this that will take away a large part of the tedium involved.

To this end, I've been writing some code for Excel that will, hopefully, be a start.

It's pretty simple at the moment, just a routine to display your Karaoke files in a spreadsheet and be able to do some manipulations to identify duplicates by either Disc ID, Artist or Song.

In a test scenario, I can list ~2,000 files and identify duplicates in a given field in less than 10 seconds.
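The Excel code isn't posted yet, so the following is not it. It is just a rough Python sketch of the same idea (split each name into fields, then look for repeats in one field), assuming names laid out as "DiscID - Song - Artist.ext" like the Sunfly examples earlier in the thread. The function names and the field names are invented for illustration.

```python
from collections import defaultdict


def parse_name(filename):
    """Split 'DiscID - Song - Artist.ext' into a dict, or return None."""
    stem = filename.rsplit(".", 1)[0]          # drop the extension
    parts = [p.strip() for p in stem.split(" - ")]
    if len(parts) != 3:
        return None                            # layout not recognised
    disc_id, song, artist = parts
    return {"disc_id": disc_id, "song": song, "artist": artist, "file": filename}


def duplicates_in_field(filenames, field):
    """Return {value: [files]} for values of `field` shared by several files."""
    groups = defaultdict(list)
    for name in filenames:
        row = parse_name(name)
        if row is not None:
            # Compare case-insensitively so "Maggie mae" matches "Maggie Mae".
            groups[row[field].lower()].append(name)
    return {value: files for value, files in groups.items() if len(files) > 1}
```

Calling duplicates_in_field(files, "song") lists every song title that appears on more than one file, and "disc_id" or "artist" work the same way. Names that don't split into exactly three parts are skipped rather than guessed at, so they would still need a look by hand.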

I hope, at a later stage, to be able to automatically delete identified files and swap artist and song sections.

I hope to post this tonight or tomorrow.
