{"id":1009,"date":"2014-02-24T02:33:18","date_gmt":"2014-02-24T02:33:18","guid":{"rendered":"http:\/\/wp.sigmod.org\/?p=1009"},"modified":"2015-07-22T00:37:36","modified_gmt":"2015-07-22T00:37:36","slug":"systems-databases-lets-break-down-the-walls","status":"publish","type":"post","link":"https:\/\/wp.sigmod.org\/?p=1009","title":{"rendered":"Systems &#038; Databases: Let\u2019s Break Down the Walls"},"content":{"rendered":"<table style=\"text-align: left; width: 100%; font-family: Verdana;\" border=\"0\" cellspacing=\"2\" cellpadding=\"2\">\n<tbody>\n<tr>\n<td style=\"vertical-align: top;\">\nAfter hanging out exclusively with the database community for 35+ years, I\u2019ve recently become more involved with the systems research community. I have a few observations and recommendations to share. Much of the work published in systems conferences covers topics that would have a natural home in database conferences. For example, transactions and data streams are currently in vogue in the systems field. Here are some recent Best-Paper-award topics: data streams at SOSP 2013, data-parallel query processing at OSDI 2008, and transactions at OSDI 2012 and SOSP 2007. Although this topic overlap has been increasing lately, it is not just a recent phenomenon. In the last 10+ years, the systems community has taken an interest in data replication, fault tolerance, key-value stores, data-parallel processing (i.e., map-reduce), query processing in sensor networks, and record caching. Some of this work has had a major impact on the database field, e.g., map-reduce and multi-master replication. <span style=\"color: #003578;\"><b>Yet despite this overlap of topics, there has been remarkably little attendance of database researchers at systems conferences or of systems researchers at database conferences. Not a good thing. 
And until 2013, I was part of the problem.<\/b><br \/>\n<\/span><\/p>\n<p>To help break down this barrier, SIGMOD and SIGOPS have sponsored the annual Symposium on Cloud Computing (SoCC) since 2010. Personally, I\u2019ve found attending SoCC\u2019s to be time well spent, and I had a good experience attending ICDCS 2013 in July. So I decided to attend my first SOSP in October 2013. It was great. The fraction of that program of direct interest to me was as large as at any database conference, including work on transactions, data streams, replication, caching, fault tolerance, and scalability. I knew there wouldn\u2019t be a huge number of database attendees, but I wasn\u2019t prepared to be such an odd duck. At most a dozen attendees out of 600 were card-carrying members of the database community. With few database friends to hang out with, I got to meet a lot more new people than I would at a database event and hear about projects in my areas that I wouldn\u2019t otherwise have known about. Plus, I could advertise my own work to another group of folks.<\/p>\n<p>If you do systems-oriented database research, then how about picking one systems conference to attend this year? NSDI in Seattle in April, ICDCS in Madrid in late June, or OSDI in Colorado in October. And if you see a systems researcher looking lonely at a database conference, then strike up a conversation. If they feel welcome, perhaps they\u2019ll tell others and we\u2019ll see more systems attendees at database events.<\/p>\n<p>Given the small overlap of attendees in systems and database conferences, we shouldn\u2019t be surprised that the communities have developed different styles of research papers. Since systems papers typically describe mechanisms lower on the stack, they often focus on broader usage scenarios. They value the \u201cities\u201d (e.g., scalability, availability, manageability, and security) more than database papers do. 
They favor simple ideas that work robustly over complex techniques that report improvements for some inputs. They expect a paper to work through the system details in a credible prototype, and to report on lessons learned that are more broadly applicable. They expect more micro-benchmarks that explain the source of performance behavior that\u2019s observed, rather than just benchmarks that model usage scenarios, such as TPC.<\/p>\n<p><b> From attending systems conferences, serving on SoCC program committees (PC\u2019s), and submitting an ill-fated paper to a systems conference, I learned a few things about systems conferences that we database folks could learn from:<\/b><\/p>\n<p><b> 1. More of their conferences are single-track, and they leave more time for Q&amp;A.<\/b> This leads to more-polished presentations. When you\u2019re presenting a paper to 600 attendees, doing it badly is a career-limiter.<\/p>\n<p>I\u2019ve always liked CIDR, not only because of its system-building orientation, but also because it\u2019s single-track. I\u2019m forced to attend sessions on topics I normally would ignore, and hence I learn more. I think we should try making one of our big conferences single-track: SIGMOD, VLDB or ICDE. One might argue that too many excellent papers are submitted, so it\u2019s impractical. I disagree. Here\u2019s one way to do it: Have the same acceptance rate as in the past, and all papers are published as usual. However, the PC selects only one-track\u2019s worth of papers for presentation slots. The other papers are presented in poster sessions that have no competing parallel sessions. The single-track enables us to learn about more topics, the density of great presentations is higher, and the poster sessions enable us to dig deep on papers of interest to us (as some of our DB conferences already enable us to do). Unless our field stops growing, we really have to do something like this. 
Our conferences have already gotten out of hand with as many as seven parallel sessions.<\/p>\n<p><b> 2. When describing related work, systems papers cast a wider net.<\/b> Their goal is often not simply to demonstrate that the paper\u2019s contributions are novel, but also to educate the reader about lines of work that are loosely related to the paper\u2019s topic. To help enable this breadth, they\u2019ve recently settled on a formula where references don\u2019t count toward the paper\u2019s maximum page count. We should adopt this view in the database community. One self-serving benefit is that papers would cite more references, which would increase all of our citation counts.<\/p>\n<p><b> 3. There\u2019s a widespread belief that systems PC\u2019s write longer, more constructive reviews.<\/b> I think DB PC\u2019s have been improving in this respect, but on average we\u2019re not up to the systems\u2019 standard.<\/p>\n<p><b> 4. They usually shepherd all papers, <\/b> to ensure that authors incorporate recommended changes. This also gives the authors someone to ask about ambiguous recommendations. We should do this too.<\/p>\n<p><b> 5. They typically require 10pt font.<\/b> As a courtesy to those of us over a certain age, whose eyes aren\u2019t what they used to be, we should do this, and increase the page count to maintain the same paper length.<\/p>\n<p><b> 6. At SOSP and OSDI, they post all slide decks and videos of all presentations.<\/b> Obviously worthwhile. We sometimes post slide decks and videos only for plenaries. Let\u2019s do better.<\/p>\n<p><b> There are a few aspects of systems conferences that I\u2019m less enthusiastic about. <\/b><\/p>\n<p><b> 1. Systems conferences favor live PC meetings. <\/b> They do have benefits. It\u2019s educational for PC members to hear discussions of papers they didn\u2019t review and for young researchers to see how decisions are reached. 
Since all PC members hear every paper discussion, non-reviewers can bring up points that neutralize inappropriate criticisms or raise issues that escaped reviewers\u2019 attention, which reduces the randomness of decisions. However, there are also negatives. In borderline cases, the most articulate, quick-thinking, and extroverted PC members have an inappropriate advantage in getting their way. And inevitably, some PC members can\u2019t attend the meeting, due to a schedule conflict or the travel expense, so the papers they reviewed get short shrift.<\/p>\n<p>On balance, I don\u2019t think the decisions produced by live PC meetings are enough better than on-line discussions to be worth what they cost. Perhaps we can get some of the benefits of a PC meeting by allowing PC members to see the reviews and discussions of all papers for which they don\u2019t have a conflict, late in the discussion period (which we did some years ago). With large PC\u2019s, this might constitute public disclosure, in which case patents would have to be filed before submission, which will delay the publication of some work. But if we think a broader vetting of papers is beneficial, this might be worth trying.<\/p>\n<p><b> 2. During the reviewing process of a systems conference, if a paper seems borderline, then the PC chairs solicit more reviews. <\/b> These reviews do offer a different perspective on the interest-value of the work. But they are usually not as detailed as the initial ones and may be by less-expert reviewers. It\u2019s common to receive 5 or 6 reviews of a paper submitted to a systems conference. As an author it\u2019s nice to get this feedback. And it might reduce the randomness of decisions. But it significantly increases the reviewing load on PC\u2019s. In database PC\u2019s, we rarely, if ever, do this. In my opinion, it\u2019s usually better to hold reviewers\u2019 feet to the fire and make them decide. 
Still, we\u2019d benefit from getting 4 or 5 reviews more often than never, but not as often as systems conferences.<\/p>\n<p><b> 3. Systems conferences publish very few industry papers and rarely have an industry track. <\/b> In my experience, an industry paper is judged just like a research paper, but with a somewhat lower quality bar. This is unfortunate. Researchers and practitioners benefit from reading about the functionality and internals of state-of-the-art products, even if they are only modestly innovative, especially if they are widely used. The systems community would serve their audience better by publishing more such papers.<\/p>\n<p><b> 4. The reviewing load for systems PC\u2019s is nearly double that of DB PC\u2019s, so systems PC\u2019s are proportionally smaller. <\/b> E.g., SOSP 2013 had 160 submissions and 28 PC members. If each paper had three reviews, that\u2019s 17 reviews per PC member. With five reviews\/paper, that\u2019s 29 reviews per PC member. If DB conferences cranked up the reviewing load, then we\u2019d probably participate in fewer PC\u2019s, so the workload on each of us might be unchanged. A PC member would see more of the submissions in each area of expertise, which might help reduce the randomness of decisions \u2026 maybe. Overall, I don\u2019t see a compelling argument to change, but it\u2019s debatable.<\/p>\n<p><b> 5. Like some DB conferences, many systems conferences have adopted double-blind reviewing. <\/b> I am not a fan. For papers describing a system project, I find it hard to review and, in some cases, impossible to write a paper for a double-blind conference. I have chosen not to submit some papers to double-blind conferences. I know I am not alone in this.<\/p>\n<p>In summary, the systems and DB areas have become much closer in recent years. Each community has something to learn from the other area\u2019s technical contributions and point-of-view and from its approach to conferences and publications. 
We really would benefit from talking to each other a lot more than we do.<\/p>\n<p><i>Acknowledgments:<\/i> My thanks to Gustavo Alonso, Surajit Chaudhuri, Sudipto Das, Sameh Elnikety, Mike Franklin, and Sergey Melnik for contributing some of the ideas in this blog post.<\/td>\n<\/tr>\n<tr>\n<td style=\"vertical-align: top;\">\n<p><span style=\"font-family: cambria;\"> <strong> Blogger&#8217;s Profile: <\/strong> <em><br \/>\n<a href=\"http:\/\/research.microsoft.com\/en-us\/people\/philbe\/\" target=\"_blank\">Philip A. Bernstein <\/a> is a Distinguished Scientist at Microsoft Research. Over the past 35 years, he has been a product architect at Microsoft and Digital Equipment Corp., a professor at Harvard University and Wang Institute of Graduate Studies, and a VP Software at Sequoia Systems. During that time, he has published <a href=\"http:\/\/www.informatik.uni-trier.de\/~ley\/pers\/hd\/b\/Bernstein:Philip_A=.html\" target=\"_blank\">papers <\/a> and two books on the theory and implementation of database systems, especially on transaction processing and data integration, which are still the major focus of his research. He is an ACM Fellow, a winner of the ACM SIGMOD Innovations Award, a member of the Washington State Academy of Sciences and a member of the National Academy of Engineering. He received a B.S. degree from Cornell and M.Sc. and Ph.D. from University of Toronto.<br \/>\n<\/em> <\/span><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div>\n<p> Copyright @ 2014, Philip A. Bernstein, All rights reserved.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>After hanging out exclusively with the database community for 35+ years, I\u2019ve recently become more involved with the systems research community. I have a few observations and recommendations to share. Much of the work published in systems conferences covers topics that would have a natural home in database conferences. 
For example, transactions and data streams are [&hellip;]<\/p>\n","protected":false},"author":19,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[],"coauthors":[],"class_list":["post-1009","post","type-post","status-publish","format-standard","hentry","category-databases"],"views":884,"_links":{"self":[{"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/posts\/1009","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/users\/19"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1009"}],"version-history":[{"count":36,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/posts\/1009\/revisions"}],"predecessor-version":[{"id":1012,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=\/wp\/v2\/posts\/1009\/revisions\/1012"}],"wp:attachment":[{"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1009"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1009"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1009"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/wp.sigmod.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcoauthors&post=1009"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}