Subscribe here to receive the Data Science Roundup every Sunday morning.

The Truthful Art

The “Rules” of Data Visualization Get an Update | National Geographic

Geoff McGhee talks with Alberto Cairo, author and Knight Chair in Visual Communication at the University of Miami, about his forthcoming book, The Truthful Art: Data, Charts, and Maps for Communication. The book is a follow-up to Cairo’s first book, The Functional Art, in which he tied “together many strands into a compelling and easy-to-follow guide to data visualization and information graphics.” McGhee and Cairo discuss the “confluence of factors” currently driving the popularization of data visualization, and how understanding basic statistical concepts is essential to “use visualization to explain information to the public in an understandable way.”

RJMetrics Data Science Roundup: @mcgeoff & @albertocairo discuss the new “rules” of data visualization: https://goo.gl/Su9Bf9

Machine Learning Isn’t Data Science

Machine Learning Isn’t Data Science | Medium

Nwokedi Idika, Senior Research Scientist at Shape Security, used to think that “Data Science was just some new faddish word for Machine Learning.” Over time, however, he came to appreciate how the two differ. Idika explains the set of techniques that go into machine learning, as well as the steps in the data science process. Ultimately, Idika believes that “Machine Learning isn’t a necessary condition of Data Science.” Rather, “Machine Learning is a type of analysis you *might* perform as part of Data Science.”

RJMetrics Data Science Roundup: @nwokedi explains why ML isn’t data science via @Medium: https://goo.gl/Su9Bf9

Fighting Disease Outbreaks with Data

Doctors Without Borders Fights Outbreaks with New Tech (And Paper) | FiveThirtyEight

Rebecca Grais, director of an epidemiology unit with Doctors Without Borders, joins Jody Avirgan on FiveThirtyEight’s What’s The Point podcast to discuss how data is used to help areas facing some of the most serious disease- and poverty-related challenges in the world. Grais uses Niger, one of the poorest countries in the world, as an example of how her unit “tackles these complex problems with a blend of advanced technology and simpler tools. While epidemiological research employs modeling and statistical analysis, in the end much of the effort comes down to basic paperwork — on actual paper forms filled out in rural homes and hospitals throughout the country.”

RJMetrics Data Science Roundup: Listen to why @MSF fights outbreaks w/ data (& paper) @jodyavirgan @FiveThirtyEight https://goo.gl/Su9Bf9

Jupyter Notebook Tricks

Advanced Jupyter Notebook Tricks – Part I | Domino

The team at Domino Data Lab loves using Jupyter notebooks “for experimenting with new ideas or data sets” and “interactive exploratory analysis,” but admits that it’s easy to overlook additional “powerful features and use cases.” In Part I of a two-part post, they walk through how to use Jupyter to create “pipelines and reports.” In Part II, they will explore how to create interactive dashboards.

RJMetrics Data Science Roundup: New tricks with Jupyter notebooks from @DominoDataLab https://goo.gl/Su9Bf9

The Data Science Language Wars

The problem with the data science language wars | Wes McKinney

Wes McKinney, of Cloudera, shares his perspective on “the assorted ‘Python vs R’ click-bait articles and Hacker News posts.” McKinney says: “The worst part of the superficial ‘R vs Python’ articles is that they’re adding noise where there ought to be more signal about some of the real problems facing the data science community.”

RJMetrics Data Science Roundup: @wesmckinn on the problem with the data science language wars: https://goo.gl/Su9Bf9

Deep Learning Goes Virus Hunting

Baidu, the ‘Chinese Google,’ is Teaching AI to Spot Malware | Wired

Cade Metz reports on how the big Internet giants are using neural networks and deep learning in a variety of ways. Metz talks with Andrew Ng about how Ng is using deep neural nets to help guide the security software of the Chinese Internet giant Baidu. “Just as a neural net can learn to recognize a face, it can learn to identify a virus.” According to Metz, the research points to a new trend in the world of deep learning where a “tiny agent” put on user phones “can identify malware without calling back to the data center.” Metz elaborates: “Typically, this is not how a deep learning service works. It operates in two stages—training and execution—but both stages happen in the data center, tapping into a vast network of machines. (This is why Google Now doesn’t work when you’re not connected to the Internet.) But researchers are now working to hone the execution stage so that it can run on phones, even without an Internet connection.” Ultimately, Metz makes the point that “AI is no longer a niche pursuit. It’s just part of how we compute—or, indeed, how we live.”

RJMetrics Data Science Roundup: @CadeMetz talks with @AndrewYNg on how Baidu is using deep learning for security https://goo.gl/Su9Bf9

Each week we surface, summarize, and share the most interesting stories and biggest news from the world of data science. Have articles or podcasts that you think we should be covering in our Data Science Roundup? Send them to editor@rjmetrics.com. If you’re not signed up to receive the Data Science Roundup, subscribe here.
