Microsoft’s Purchase of GitHub Leaves Some Scientists Uneasy

June 15, 2018 at 11:04AM
via Scientific American Content: Global

GitHub—a website that has become popular with scientists collaborating on research data and software—is to be acquired by Microsoft for US$7.5 billion. In the wake of the takeover announcement on 4 June, some scientists and programmers voiced concerns about the deal on social media. They fear that the site will become less open, or less useful for sharing and tracking scientific data, after the buyout. But others are hopeful that Microsoft’s stewardship will make the platform even more valuable.

GitHub launched in 2008, and is now widely used to store, share and update data sets and software code. As of 13 June, more than 223,000 academic papers on Google Scholar cited the website, which is free to use for projects that release their code. One of the features that sets GitHub apart from many similar websites is its use of version-control software known as Git, which transparently records changes to files. This allows programmers in different locations to work on the same project in real time, and to track changes and merge updated data. During the 2014–16 Ebola outbreak in West Africa, for example, researchers used the platform to share and cross-check daily patient counts.


Although Microsoft says GitHub will remain open to any project, some scientists are sceptical about that commitment. “Open Science is not compatible with one corporation owning the platform used to collaborate on code. I hope that expert coders in #openscience have a viable alternative to #github,” tweeted Tom Johnstone, a cognitive neuroscientist at the University of Reading, UK.


Björn Grüning, a bioinformatician at the University of Freiburg in Germany, says some researchers are wary of Microsoft because the company has been slow to release its own tools as open source, or to make its services compatible with open-source projects. He has several projects on GitHub, but says he will move them to another service if Microsoft makes the platform less open, forces its own tools on users or changes the pricing model.


Mahmood Zargar, who studies open-source communities at the Free University of Amsterdam in the Netherlands, is more concerned that Microsoft will impose changes that make GitHub less efficient for him to use. He’s planning to move several projects to other services.


A spokesperson for Microsoft did not answer Nature’s questions about researchers’ concerns, but referred to a blogpost by company chief executive Satya Nadella. “We recognize the responsibility we take on with this agreement,” Nadella wrote. “We are committed to being stewards of the GitHub community, which will retain its developer-first ethos, operate independently and remain an open platform.” The post also states that the company will listen to developers’ feedback and invest in both fundamental features and new capabilities.


Unconcerned


Some researchers say that fears about Microsoft’s acquisition of the platform are overblown. “I’m not convinced that Microsoft owning GitHub is that big a deal to busy researchers,” says Arfon Smith, a data-science manager at the Space Telescope Science Institute in Baltimore, Maryland, and a former GitHub programme manager. Smith, who began using the platform for his own research in 2009 and has more than 200 projects there, doesn’t think Microsoft will change the collaborative features that researchers care about, such as its ease of use.


Other scientists, such as Ruibang Luo, a bioinformatician at the University of Hong Kong, think Microsoft will use its resources to boost the platform’s user numbers, which would increase the number of potential collaborators. “Satya Nadella has done a good job opening up Microsoft’s products to competitors’ platforms,” he says. “So I’m willing to believe it’s a great deal, unless they prove me wrong.” Katy Huff, a nuclear engineer at the University of Illinois at Urbana-Champaign, also thinks GitHub will give Microsoft an opportunity to support science.


Decentralized systems


Daniel Himmelstein, a data scientist at the University of Pennsylvania in Philadelphia, says that GitHub is problematic for researchers, but that this has nothing to do with the Microsoft acquisition.


GitHub hosts repositories of code or data created with the open-source tool Git. Because these repositories can be distributed among users, they can still have backups if a server dies. However, certain information, such as comments on projects and requests to add code, is stored on GitHub’s website. Some of these data are an important part of the scientific record, says Himmelstein, but they are at risk from outages, surveillance or censorship. “Regardless of the Microsoft acquisition, GitHub, as a centralized and closed company, possesses a dangerous level of control over the open-source ecosystem,” he says.


Scientists face fewer threats, says Himmelstein, if they put their work on decentralized hosting systems, such as the git-ssb project, which don’t have a single point of failure. “To the extent that the Microsoft acquisition makes people aware of the centralized nature of GitHub,” he says, “that’s a positive thing.”


This article is reproduced with permission and was first published on June 15, 2018.