Improve quality of automatic metadata extraction
Improve the quality of the automatic metadata extraction; add automatic retrieval of metadata from arXiv, PubMed etc.
Ian Phillips commented
If I drop a PDF into Mendeley it does a metadata search on the PDF and typically recognizes the title, year, authors and DOI (if present) with no issues. It typically struggles to recognize the journal title. If it finds the DOI (or arXiv ID or PMID) then it will update the metadata, e.g. journal title / URL. It doesn't add the abstract, author keywords or tags.
While this data may not be available from the DOI directly, it typically is via the URL. Please can this be corrected?
Conversely, if the DOI is not found, then you have the option of "Search by Title". This will typically find the URL, from which all other metadata can be found.
Currently the process is very manual, which stops people from moving from one management system to another.
Sebastian XX commented
When using the web importer on SpringerLink to import a book chapter, it wrongly imports the whole book. BibTeX import results in an empty entry.
Patryk Kubiczek commented
For the new-style arXiv identifiers (http://arxiv.org/hypertex/bibstyles/) the fields archivePrefix, eprint and primaryClass are necessary in the BibTeX file. At the moment Mendeley provides primaryClass only for eprints from before 2007, while for newer papers it gives just the arXiv ID. I believe you should focus on that problem.
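To illustrate the three fields mentioned above, here is a minimal Python sketch of how an exporter might fill them in. The function names and the two regular expressions are my own assumptions, not Mendeley's code. The sketch assumes that a new-style ID (YYMM.NNNN or YYMM.NNNNN) does not contain the subject class, so the class has to be supplied from elsewhere, while an old-style ID such as hep-th/9901001 carries the archive in the ID itself.

```python
import re

# New-style arXiv IDs (assumed pattern): YYMM.NNNN or YYMM.NNNNN, optional version.
NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Old-style IDs (assumed pattern): archive[.SUBJECT]/YYMMNNN, optional version.
OLD_STYLE = re.compile(r"^([a-z-]+(?:\.[A-Z]{2})?)/\d{7}(v\d+)?$")


def arxiv_bibtex_fields(arxiv_id, primary_class=None):
    """Return archivePrefix, eprint and primaryClass for an arXiv ID (hypothetical helper)."""
    fields = {"archivePrefix": "arXiv", "eprint": arxiv_id}
    if NEW_STYLE.match(arxiv_id):
        # The class is not encoded in a new-style ID, so the caller must supply it.
        if primary_class is None:
            raise ValueError("primary_class is required for new-style IDs")
        fields["primaryClass"] = primary_class
        return fields
    match = OLD_STYLE.match(arxiv_id)
    if match is None:
        raise ValueError("not a recognised arXiv identifier: " + arxiv_id)
    # Old-style IDs start with the archive name, which is used as a fallback.
    fields["primaryClass"] = primary_class or match.group(1)
    return fields


def format_fields(fields):
    """Render the fields as lines that could be pasted into a BibTeX entry."""
    return "\n".join(f"  {key} = {{{value}}}," for key, value in fields.items())


print(format_fields(arxiv_bibtex_fields("0707.3168", "hep-th")))
```

For a new-style paper the primary class would have to come from the arXiv record itself, which is exactly the lookup that seems to be missing at the moment.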
Michael Johnson commented
My Mendeley plugin for Safari is consistently unable to pull metadata from ScienceDirect papers, even when I'm sitting on a page with all the bibliographic info, abstract and a direct link to the PDF. Often, the metadata found in Google Scholar is inadequate. A more consistent and comprehensive search source such as PubMed (for medical literature) would be appreciated as an option.
Kate Everly commented
I would just appreciate being able to see all the importable fields on the web import window when I first select an item to "Save to Mendeley." If I could correct any errors immediately it would save me having to go find the document in the desktop, go back to web to review and then make the corrections.
Carl Edlund Anderson commented
There are just weird (and what I would think are easily fixable) errors in the way Mendeley gets the metadata from JSTOR PDFs. The way JSTOR presents its metadata within the PDF is a bit old-fashioned (if human-friendly), but quite regular. There's really no reason for Mendeley to confuse the download date with the publication date, etc.
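As a rough illustration of why this looks fixable, here is a small Python sketch that reads the publication year from the "Source:" line and ignores the "Accessed:" line. The sample text is made up, and the layout it assumes (a "Source: Journal, Vol. X, No. Y (date), pp. A-B" line) is only my recollection of a typical JSTOR cover page, so treat the pattern as an assumption rather than a specification.

```python
import re

# Made-up sample in the layout assumed above; not taken from a real JSTOR PDF.
SAMPLE_COVER = """\
Source: Example Journal, Vol. 12, No. 3 (Sep., 1998), pp. 101-120
Stable URL: http://www.jstor.org/stable/0000000
Accessed: 14/02/2012 09:30
"""

# Only the "Source:" line is consulted, so the download date is never considered.
SOURCE_LINE = re.compile(
    r"^Source:\s*(?P<journal>.+?),\s*Vol\.\s*(?P<volume>\d+)"
    r"(?:,\s*No\.\s*(?P<issue>\d+))?"
    r"\s*\([^)]*?(?P<year>\d{4})\)"
    r"(?:,\s*pp?\.\s*(?P<pages>[\d-]+))?",
    re.MULTILINE,
)


def parse_jstor_cover(text):
    """Return journal, volume, issue, year and pages, or None if no Source line matches."""
    match = SOURCE_LINE.search(text)
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


print(parse_jstor_cover(SAMPLE_COVER))
```

Real cover pages will have more variation (missing issue numbers, seasons instead of months), so a production parser would need more patterns than this one.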
improve the quality of automatic metadata extraction for JSTOR's pdf !!!!!
Please improve the quality of automatic metadata extraction for JSTOR's pdf !!!!!
James Doyle commented
Review, search by title... if you can't fix it, add a "stop searching" option.
Marcis Gasuns commented
Support for http://archive.org/index.php would not hurt, at least for its PDFs (e.g., http://archive.org/details/akoontalorlostr00monigoog)
Danielle L commented
Wow, people have been aware of this problem for a long time and it is still not fixed. Any comment from Mendeley?
Is it possible to re-enter document details automatically? Just leave "Search by Title" button.
YES !! improve the quality of importation from JSTOR... (and pdf too !)
Eric Moyer commented
The following may be adaptable to better extraction from PDFs:
The code is GPL v3, so it may not work for you, but their paper is, at least, worth reading.
Carl Edlund Anderson commented
Beyond the improved ability to extract metadata more accurately from 3rd-party resources (whether from the given journal's or publisher's web site, Google Books, or other databases), it seems to me like Mendeley ought to do a better job of mining its own store of metadata, whether stored within the given user's local database or in the whole aggregated store of all Mendeley users. It seems like it would be perfectly possible for Mendeley to compare the metadata for a given item in a local database with the metadata for other items in its local or remote databases, and either automatically or upon request suggest alterations or additions. For example, your entry for a given article might be missing an ISSN, but Mendeley might guess that your entry was the same as that of 20 other people who had metadata for what looked like the same article, including the ISSN, and it could ask you if you wanted to add the ISSN to your local entry's metadata, etc.
That shouldn't be that hard, I should think ....? And potentially very useful!
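A minimal Python sketch of the suggestion step described above, under the assumption that the other users' entries have already been matched to the same article (the matching itself is the harder part and is not shown). The function name, the dictionary layout and the `min_support` threshold are all hypothetical.

```python
from collections import Counter


def suggest_missing_fields(local, others, min_support=3):
    """Suggest values for fields that are empty locally but agreed on by other entries.

    `local` is one entry as a dict; `others` is a list of dicts assumed to describe
    the same article. A value is suggested only if at least `min_support` entries
    share it.
    """
    suggestions = {}
    candidate_fields = {field for entry in others for field in entry}
    for field in sorted(candidate_fields):
        if local.get(field):
            continue  # never overwrite what the user already has
        values = Counter(entry[field] for entry in others if entry.get(field))
        if not values:
            continue
        value, count = values.most_common(1)[0]
        if count >= min_support:
            suggestions[field] = value
    return suggestions


mine = {"title": "An Example Article", "year": "2010"}
theirs = [{"title": "An Example Article", "issn": "0000-0000"}] * 20
print(suggest_missing_fields(mine, theirs))
```

The result is only a list of suggestions; the user would still be asked before anything is written to the local entry.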
If the metadata extraction is improved, will this also give further options in the file rename feature?
Philipp Stachwitz commented
The number of votes and comments shows that this is the major problem of Mendeley. Incorrect metadata was the reason I left Mendeley and went back to the commercial product I came from (Papers for Mac). Why isn't it at least possible to add the correct metadata when it's available, as in so many cases where publishers provide it? BMJ, for example.
First of all, improve the quality of automatic metadata extraction for files I create (.doc or .pdf). I entered metadata when I created a PDF or .doc file (author, title, tags, ... in the document properties), but Mendeley doesn't recognize this metadata :(
Simone Hochgreb commented
SpringerLink documents are not being picked up. I tried with page:
and it prompts me to fill out the boxes rather than extracting it.
It may have something to do with the fact that this is a Chinese server - I had to remove the block from the browser to allow it to save to Mendeley.
The number of "Needs Review" papers could be reduced with a simple procedure.
E.g., I often have arXiv papers for which Mendeley has already retrieved the arXiv number. If Mendeley could also do an automatic arXiv number lookup, which most of the time brings up the DOI, and then do a DOI lookup, the whole metadata would be available immediately.
So please allow the user to apply certain rules to metadata retrieval, just like you can apply rules to your email in email management software such as Thunderbird or Outlook.
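The chained lookup could be sketched like this in Python. Everything here is hypothetical: the `enrich` function, the rule format (a field name plus a lookup function) and the two fake lookup tables that stand in for real arXiv and DOI services. The idea is simply that each rule fires once as soon as the field it needs is present, so an arXiv ID can lead to a DOI, and the DOI to the remaining fields.

```python
def enrich(record, rules):
    """Apply lookup rules until no rule can fire any more.

    Each rule is a (field, lookup) pair; `lookup` takes the value of `field`
    and returns a dict of extra metadata, or None. Existing fields are kept.
    """
    record = dict(record)
    pending = list(rules)
    progress = True
    while pending and progress:
        progress = False
        for rule in list(pending):
            field, lookup = rule
            if field not in record:
                continue  # this rule cannot fire yet
            pending.remove(rule)  # each rule is applied at most once
            progress = True
            for key, value in (lookup(record[field]) or {}).items():
                record.setdefault(key, value)
    return record


# Fake lookup tables standing in for real services (made-up values).
fake_arxiv = {"1234.5678": {"doi": "10.0000/example"}}
fake_doi = {"10.0000/example": {"journal": "Example Journal", "year": "2012"}}

rules = [
    ("doi", lambda doi: fake_doi.get(doi)),
    ("arxiv_id", lambda arxiv_id: fake_arxiv.get(arxiv_id)),
]

print(enrich({"arxiv_id": "1234.5678"}, rules))
```

Because rules wait until their input field exists, the order in which the user lists them does not matter, which is roughly how mail filter rules behave too.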