lizhubbard

Master source list has multiple duplicate entries


Some time ago, I created a subset of people in my project and created a new dataset from them, to do some separate research for my daughter-in-law's parents. I discovered that there were subsequently two copies of every source in the master source list. If I filtered by dataset, it showed only the ones for the active dataset, but I had to do that every single time I opened the list, so I usually didn't bother, and just got used to having 1500 sources instead of 750.

 

I finally found the two-dataset approach to be unworkable, and merged the datasets. I've now finished merging the 47 extra people, but Help! now the Master source list has three!! copies of every source! Two of each source have no citations. I suspect this replication had something to do with the merge.

 

I have been slowly going through the list and deleting the two "extra" entries, but it is a very slow process, and there are about 750 sources to get through (tripled!).

 

Is there a more efficient way to do this? And how can I avoid this replication in future? Thanks for any insights,

--Liz


Hi Liz,

 

If you have not done so, it would be worth your time to review Terry Reigel's web page about Merging Projects and Datasets. It explains why you are seeing what appear to be two or three copies of each source, each belonging to a different dataset in the same project.

 

I'm not sure how you are "deleting" the "extra two sources". Again, it depends on which dataset in the project you are working in at the time. Rather than deleting, have you looked into the Merge capability of the Master Source List? If these sources really are identical, that may be a better way to go. You may be deleting source references that you are actually using.

 

If you have also linked repositories to your sources you have an additional issue to clean up. If you merge two sources, each linked to the same repository, you will end up with the same repository linked twice (one marked primary and the other not if both were marked primary before the merge). It's up to you to also clean that up. While a source can be linked to multiple repositories, only one may be marked “primary”, and that is the only one that will appear in reports, so it may not cause you problems, but should be addressed.

 

Hope this gives you ideas,


Liz, I think one copy of all the sources is from the second data set. If you have merged the two now (actually, "merge" means you copied the people, sources, etc., from one to the other, but both remain in your project) you can now delete that second data set. That's all discussed in the article that Michael referred you to above.

 

That will likely still leave you with two copies of each source. Assuming that each source in each pair is cited for various people, the thing to do is then to merge the pairs, which automatically transfers the citations from the source that's being deleted to the other one.


Michael,

Thank you for your reply. I had read all the help files, and Terry's tips, and chose to merge the two datasets ("There are some very good reasons to merge Data Sets, including merging Data Sets one originally created separately and now wants combined.").

 

I suspect that the issue of three copies of each source (I'm a "splitter" not a "lumper" so they are very specific) may have been my own doing. The process of creating a new dataset, then merging it back into my main dataset seems to have dragged along all the sources each time. No doubt that was a poor choice I made at the time of creating the new dataset.

 

It's clear in the now-merged master source list that the first one of each set of three (lowest source id number) has been cited, and the other two (numbered in two much larger sequences) have no associated citations. All the sources are in the same dataset. A few instances merit merging as you note.

 

In hindsight, it's possible that copying individuals instead of merging may have worked better, but would also have been time-consuming. It has been somewhat tedious, but I have nearly finished deleting the excess 1500 or so sources. I have noticed too, that repositories have been duplicated or triplicated, and will deal with them the same way.

 

What would have worked best would have been not to create a separate dataset in the first place. From now on, everyone stays in one dataset, in one project!!

--Liz

