Pages

Showing posts with label platform. Show all posts
Showing posts with label platform. Show all posts

Tuesday, July 12, 2016

We are joining the Open edX platform



A year ago, we released Course Builder, an experimental platform for online education at scale. Since then, individuals have created courses on everything from game theory to philanthropy, offered to curious people around the world. Universities and non-profit organizations have used the platform to experiment with MOOCs, while maintaining direct relationships with their participants. Google has published a number of courses including Introduction to Web Accessibility which opens for registration today. This platform is helping to deliver on our goal of making education more accessible through technology, and enabling educators to easily teach at scale on top of cloud platform services.

Today, Google will begin working with edX as a contributor to the open source platform, Open edX. We are taking our learnings from Course Builder and applying them to Open edX to further innovate on an open source MOOC platform. We look forward to contributing to edX’s new site, MOOC.org, a new service for online learning which will allow any academic institution, business and individual to create and host online courses.

Google and edX have a shared mission to broaden access to education, and by working together, we can advance towards our goals much faster. In addition, Google, with its breadth of applicable infrastructure and research capabilities, will continue to make contributions to the online education space, the findings of which will be shared directly to the online education community and the Open edX platform.

We support the development of a diverse education ecosystem, as learning expands in the online world. Part of that means that educational institutions should easily be able to bring their content online and manage their relationships with their students. Our industry is in the early stages of MOOCs, and lots of experimentation is still needed to find the best way to meet the educational needs of the world. An open ecosystem with multiple players encourages rapid experimentation and innovation, and we applaud the work going on in this space today.

We appreciate the community that has grown around the Course Builder open source project. We will continue to maintain Course Builder, but are focusing our development efforts on Open edX, and look forward to seeing edX’s MOOC.org platform develop. In the future, we will provide an upgrade path to Open edX and MOOC.org from Course Builder. We hope that our continued contributions to open source education projects will enable anyone who builds online education products to benefit from our technology, services and scale. For learners, we believe that a more open online education ecosystem will make it easier for anyone to pick up new skills and concepts at any time, anywhere.
Read More..

Tuesday, May 24, 2016

Facilitating Genomics Research with Google Cloud Platform



The understanding of the origin and progression of cancer remains in its infancy. However, due to rapid advances in the ability to accurately read and identify (i.e. sequence) the DNA of cancerous cells, the knowledge in this field is growing rapidly. Several comprehensive sequencing studies have shown that alterations of single base pairs within the DNA, known as Single Nucleotide Variants (SNVs), or duplications, deletions and rearrangements of larger segments of the genome, known as Structural Variations (SVs), are the primary causes of cancer and can influence what drugs will be effective against an individual tumor.

However, one of the major roadblocks hampering progress is the availability of accurate methods for interpreting genome sequence data. Due to the sheer volume of genomics data (the entire genome of just one person produces more than 100 gigabytes of raw data!), the ability to precisely localize a genomic alteration (SNV or SV) and resolve its association with cancer remains a considerable research challenge. Furthermore, preliminary benchmark studies conducted by the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) have discovered that different mutation calling software run on the same data can result in detection of different sets of mutations. Clearly, optimization and standardization of mutation detection methods is a prerequisite for realizing personalized medicine applications based on a patient’s own genome.

The ICGC and TCGA are working to address this issue through an open community-based collaborative competition, run in conjunction with leading research institutions: the Ontario Institute for Cancer Research, University of California Santa Cruz, Sage Bionetworks, IBM-DREAM, and Oregon Health and Sciences University. Together, they are running the DREAM Somatic Mutation Calling Challenge, in which researchers from across the world “compete” to find the most accurate SNV and SV detection algorithms. By creating a living benchmark for mutation detection, the DREAM Challenge aims to improve standard methods for identifying cancer-associated mutations and rearrangements in tumor and normal samples from whole-genome sequencing data.

Given Google’s recent partnership with the Global Alliance for Genomics and Health, we are excited to provide cloud computing resources on Google Cloud Platform for competitors in the DREAM Challenge, enabling scientists who do not have ready access to large local computer clusters to participate with open access to contest data as well as credits that can be used for Google Compute Engine virtual machines. By leveraging the power of cloud technologies for genomics computing, contestants have access to powerful computational resources and a platform that allows the sharing of data. We hope to democratize research, foster the open access of data, and spur collaboration.

In addition to the core Google Cloud Platform infrastructure, the Google Genomics team has implemented a simple web-based API to store, process, explore, and share genomic data at scale. We have made the Challenge datasets available through the Google Genomics API. The challenge includes both simulated tumor data for which the correct answers are known and real tumor data for which the correct answers are not known.
Genomics API Browser showing a particular cancer variant position (highlighted) in dataset in silico #1 that was missed by many challenge participants.
Although submissions for the simulated data can be scored immediately, the winners on the real tumor data will not immediately be known when the challenge closes. This is a consequence of the fact that current DNA sequencing technology does not provide 100% accurate data, which adds to the complexity of the problem these algorithms are attempting to tackle. Therefore, to identify the winners, researchers must turn to alternative laboratory technologies to verify if a particular mutation that was found in sequencing data is actually (or likely) to be true. As such, additional data will be collected after the Challenge is complete in order to determine the winner. The organizers will re-sequence DNA from the cells of the real tumor using an independent sequencing technology (Ion Torrent), specifically examining regions overlapping the positions of the cancer mutations submitted by the contest participants.

As an analogy, a "scratched magnifying glass" is used to examine the genome the first time around. The second time around, a "stronger magnifying glass with scratches in different places" is used to look at the specific locations in the genome reported by the challenge participants. By combining the data collected by those two different "magnifying glasses", and then comparing that against the cancer mutations submitted by the contest participants, the winner will then be determined.

We believe we are at the beginning of a transformation in medicine and basic research, driven by advances in genome sequencing and computing at scale. With the DREAM Challenge, we are all excited to be part of bringing researchers around the world to focus on this particular cancer research problem. To learn more about how to participate in the challenge register here.
Read More..

Sunday, March 13, 2016

Collaborative Mathematics with SageMathCloud and Google Cloud Platform



(cross-posted on the Google for Education blog and Google Cloud Platform blog)

Modern mathematics research is distinguished by its openness. The notion of "mathematical truth" depends on theorems being published with proof, letting the reader understand how new results build on the old, all the way down to basic mathematical axioms and definitions. These new results become tools to aid further progress.

Nowadays, many of these tools come either in the form of software or theorems whose proofs are supported by software. If new tools produce unexpected results, researchers must be able to collaborate and investigate how those results came about. Trusting software tools means being able to inspect and modify their source code. Moreover, open source tools can be modified and extended when research veers in new directions.

In an attempt to create an open source tool to satisfy these requirements, University of Washington Professor William Stein built SageMathCloud (or SMC). SMC is a robust, low-latency web application for collaboratively editing mathematical documents and code. This makes SMC a viable platform for mathematics research, as well as a powerful tool for teaching any mathematically-oriented course. SMC is built on top of standard open-source tools, including Python, LaTeX, and R. In 2013, William received a 2013 Google Research Award which provided Google Cloud Platform credits for SMC development. This allowed William to extend SMC to use Google Compute Engine as a hosting platform, achieving better scalability and global availability.
SMC allows users to interactively explore 3D graphics with only a browser
SMC has its roots in 2005, when William started the Sage project in an attempt to create a viable free and open source alternative to existing closed-source mathematical software. Rather than starting from scratch, Sage was built by making the best existing open-source mathematical software work together transparently and filling in any gaps in functionality.

During the first few years, Sage grew to have about 75K active users, while the developer community matured with well over 100 contributors to each new Sage release and about 500 developers contributing peer-reviewed code.

Inspired by Google Docs, William and his students built the first web-based interface to Sage in 2006, called The Sage Notebook. However, The Sage Notebook was designed for a small number of users and would work for a small group (such as a single class), but soon became difficult to maintain for larger groups, let alone the whole web.

As the growth of new users for Sage began to stall in 2010, due largely to installation complexity, William turned his attention to finding ways to expand Sages availability to a broader audience. Based on his experience teaching his own courses with Sage, and feedback from others doing the same, William began building a new Web-hosted version of Sage that can scale to the next generation of users.

The result is SageMathCloud, a highly distributed multi-datacenter application that creates a viable way to do computational mathematics collaboratively online. SMC uses a wide variety of open source tools, from languages (CoffeeScript, node.js, and Python) to infrastructure-level components (especially Cassandra, ZFS, and bup) and a number of in-browser toolkits (such as CodeMirror and three.js).

Latency is critical for collaborative tools: like an online video game, everything in SMC is interactive. The initial versions of SMC were hosted at UW, at which point the distance between Seattle and far away continents was a significant issue, even for the fastest networks. The global coverage of Google Cloud Platform provides a low-latency connection to SMC users around the world that is both fast and stable. Its not uncommon for long-running research computations to last days, or even weeks -- and here the robustness of Google Compute Engine, with machines live-migrating during maintenance, is crucial. Without it, researchers would often face multiple restarts and delays, or would invest in engineering around the problem, taking time away from the core research.

SMC sees use across a number of areas, especially:

  • Teaching: any course with a programming or math software component, where you want all your students to be able to use that component without dealing with the installation pain. Also, SMC allows students to easily share files, and even work together in realtime. There are dozens of courses using SMC right now.
  • Collaborative Research: all co-authors of a paper can work together in an SMC project, both writing the paper there and doing research-level computations.

Since launching SMC in May 2013, there are already more than 20,000 monthly active users whove started using Sage via SMC. We look forward to seeing if SMC has an impact on the number of active users of Sage, and are excited to learn about the collaborative research and teaching that it makes possible.
Read More..