Monday, November 12, 2018

2018-11-12: Google Scholar May Need To Look Into Its Citation Rate



Google Scholar has long been regarded as the digital library with the most complete collection of scholarly papers and patents. For a digital library, completeness is very important because otherwise you cannot guarantee the citation rate of a paper, or equivalently the in-degree of its node in the citation graph. That is probably why Google Scholar is still more widely used and trusted than any other digital library with fancy functions.

Today I found two very interesting aspects of Google Scholar: one clever and one silly. The clever side is that Google Scholar distinguishes papers, preprints, and slides, and counts their citations separately.

If you search for "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs", you should see the same view as the one I attached. Note that there are three results. The first is the paper published by IEEE. The second lists a completely different set of authors, who probably gave a presentation on that paper. The third is the preprint on arXiv. These three have different numbers of citations, as they should.


The silly side is also reflected in the search result. How does a paper published less than a year ago receive more than 1900 citations? You may say it is just a super popular paper. But if you look into the citations, some do not make sense. For example, the first paper that "cites" the DeepLab paper was published in 2015! How could it cite a paper published in 2018?
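
A sanity check of this kind is easy to sketch in code. The snippet below is only an illustration under my own assumptions: the `Paper` record, the `anachronistic_citations` helper, and the sample entries are all made up for the example and are not data pulled from Google Scholar. It simply flags any "citing" paper dated earlier than the paper it supposedly cites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical minimal record: just a title and a publication year.
    title: str
    year: int


def anachronistic_citations(cited, citing):
    """Return the citing papers dated earlier than the paper they cite."""
    return [p for p in citing if p.year < cited.year]


# Made-up sample data mirroring the situation described above.
cited = Paper("DeepLab (journal version)", 2018)
citing = [
    Paper("Hypothetical citing paper A", 2015),
    Paper("Hypothetical citing paper B", 2018),
    Paper("Hypothetical citing paper C", 2019),
]

suspicious = anachronistic_citations(cited, citing)
for p in suspicious:
    print(f"{p.year} paper cannot cite a {cited.year} publication: {p.title}")
```

A flagged entry does not always mean the citation is bogus. It may instead mean that the publication year attached to the cited record is not the year the work first became available.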

Actually, the first paper's citation rate is also problematic. A paper published in 2015 was cited more than 6500 times! And another paper published in 2014 was cited more than 16660 times!

Something must be wrong with Google Scholar! The good news is that the numbers look higher, which makes everyone happy! :)


Jian Wu

1 comment:

  1. It's even more complicated... the 2014 version is apparently an entirely different version without the "Deeplab:" portion in the title:

    https://arxiv.org/abs/1412.7062

    there is a 2016 version in arxiv:

    DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

    https://arxiv.org/abs/1606.00915

    that was apparently published online in 2017 with a DOI:

    https://doi.org/10.1109/TPAMI.2017.2699184

    and then was assigned a volume and number in 2018.

    Google Scholar rolls up the 2016 citations from arxiv, the 2017 citations from "online" first, and then the official 2018 version all into the same version.

    a search in google finds:

    DeepLab: Semantic Image Segmentation with Deep Convolutional ...
    https://arxiv.org › cs

    by LC Chen - 2016 - Cited by 1901 - Related articles
    Jun 2, 2016 - DeepLab: Semantic Image Segmentation
    [deletia]

    https://www.google.com/search?q=DeepLab%3A+Semantic+Image+Segmentation+with+Deep+Convolutional+Nets%2C+Atrous+Convolution%2C+and+Fully+Connected+CRFs&ie=utf-8&oe=utf-8&client=firefox-b-1
