Pregel: A System for Large-Scale Graph Processing

Grzegorz Malewicz, Matthew H. Austern, Aart J.C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski

Pregel is one of the first systems for large-scale graph analytics. It introduced vertex-centric bulk-synchronous processing as programming and computation models for graph analytics. It has generated significant follow-on research that both optimized its computation model and developed alternatives. The paper now has more than 3700 citations on Google Scholar and has become standard reading in graph analytics.

Dr. Greg Malewicz founded the world’s first deep search engine for the real estate market based on commute duration (PCT/US2019/017909). He was an engineer at Google and Facebook, where he built graph computing systems. Before Silicon Valley, he was an assistant professor of theoretical computer science in Alabama. Greg has authored dozens of scientific papers and patents, including a single-author paper that solves a decade-old parallel computing problem (SICOMP’05). He became a US citizen through the Outstanding Researcher category of first preference. Greg holds BA and MS degrees from UWarsaw and a PhD degree from UConn, with his last year at MIT. Greg has visited hundreds of UNESCO World Heritage Sites and cycled over an 18,000 ft Himalayan pass.
Matthew Austern is a Principal Software Engineer at Google, where his focus is large-scale distributed computation. He completed an SB from MIT and a PhD in physics from UC Berkeley, and has subsequently worked at SGI, AT&T Research, and Apple. Matt has also published on generic programming, and contributed to the development of the C++ language and the Standard Template Library. He is currently technical lead for Google’s event logging data warehouse.
Aart J.C. Bik received his MS degree in computer science (cum laude) from Utrecht University in 1992 and his PhD degree from Leiden University in 1996. As a Principal Engineer at Intel, he was the lead compiler architect of automatic vectorization in the Intel C++/Fortran compilers. In 2002, Aart received the Intel Achievement Award (highest company award) for making SSE easy to use through automatic vectorization. In 2007, Aart moved to Google, where he has worked on Pregel, a distributed system for large-scale graph computations, on Google Glass, and on optimizing compilers for the Android Runtime and the Dart VM. He is currently working in Google Brain on MLIR and LLVM.
Jim Dehnert received his Ph.D. in Applied Mathematics from U.C. Berkeley in 1983, advised by Susan L. Graham. After graduation, he worked primarily on compiler code generation and optimization for a series of companies, including teams doing an early Ada compiler at ROLM, the first commercial software pipeline code generator at Cydrome, the MIPSpro 64-bit compilers at SGI, and the Code-Morphing System at Transmeta. At Google, he worked on Pregel, cloud computing, and new architectures. He is now retired, and happy that birding provides a good reason to be outdoors hiking on a regular basis.
Ilan Horn was a software engineer for over 20 years, at companies ranging from start-ups to Google, developing and researching infrastructure. He left the field a few years ago to become a wildlife photographer, hoping to spark interest in the preservation of our fragile environment by showing its beauty through his photos.
Naty Leiser is a Software Engineer at Google, where he works on large-scale distributed infrastructure that many of Google’s major products use to handle big data in an efficient and policy-compliant manner. Before joining Google, Naty worked at Intel (2003-2006) on a Data Warehouse system used for BI solutions. He received his BA in Computer Science from the Technion – Israel Institute of Technology (2006).
Grzegorz Czajkowski is the SVP of Engineering at Snowflake Inc., responsible for the development of Snowflake’s cutting-edge cloud data platform. Prior to Snowflake, Grzegorz worked at Google. He began as an engineer on cluster management, and left as VP of Engineering, leading the development of data analytics services for internal Google needs as well as for cloud customers. His first job was with Sun Microsystems, where he worked on various aspects of the Java Virtual Machine, and received the ACM OOPSLA ten year award for a paper on the Multi-Tasking Virtual Machine. Grzegorz holds a Ph.D. from Cornell University and an M.Sc. from AGH (Krakow, Poland), both in Computer Science.