10 Trending Data Science Topics at ODSC East 2019
ODSC East 2019 is honored to host more than 300 of the leading experts in data science and artificial intelligence. In this evolving field, new topics, models, frameworks, and techniques are constantly emerging. Here are a few standout topics and presentations to keep an eye on:
Lukas Biewald has founded a host of data science startups and is historically one of ODSC’s highest-rated instructors. His topic on LSTMs, a type of recurrent neural network, is popular with many of our attendees.
In this introduction, Lukas will discuss LSTM networks, which are picking up speed due to their ability to learn long-term dependencies. They are applicable to a number of sequence learning problems, including language modeling and translation. Lukas will focus on using Keras and TensorFlow to build a language classifier, a generator, and a translating sequence-to-sequence model.
PyMC3 is a probabilistic programming package for Python that allows users to fit Bayesian models using a variety of numerical methods, like MCMC and VI. Its flexibility and extensibility make it applicable to a large suite of problems across countless industries, such as astronomy, molecular biology, crystallography, chemistry, ecology, psychology, and more.
In this workshop, Austin will discuss PyMC3’s core model specification and fitting functionality, as well as its functionality for summarizing output and running model diagnostics.
3. Sequence to Sequence Models in PyTorch, Mat Leonard, PhD, Product Lead, Udacity
The open-source PyTorch machine learning library, built around the idea of providing a fast and flexible modeling experience, is quickly becoming the go-to for machine learning and NLP pros, with big names like Facebook and Uber contributing to its resources. Its community is growing quickly, and improvements and updates come frequently.
In this talk, Mat will demonstrate how to build sequence-to-sequence models in PyTorch, covering techniques such as attention, teacher forcing, and curriculum learning, which improve the performance of models and decrease training time.
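Teacher forcing is easiest to see in how the decoder's next input is chosen. The sketch below is plain Python rather than PyTorch and involves no training: the `NEXT_TOKEN` lookup table is a made-up stand-in for a decoder with one deliberate mistake, and `decode` is a hypothetical helper. It only illustrates the difference between feeding the ground-truth previous token and feeding the model's own previous prediction.

```python
# Hypothetical "decoder": an imperfect next-token lookup table.
# It wrongly predicts "dog" after "the" (the target says "cat").
NEXT_TOKEN = {
    "<s>": "the",
    "the": "dog",
    "cat": "sat",
    "sat": "</s>",
    "dog": "barked",
    "barked": "loudly",
}

TARGET = ["the", "cat", "sat", "</s>"]


def decode(target, teacher_forcing):
    """Predict one token per target position.

    With teacher forcing, the next input is the ground-truth token;
    without it, the next input is the model's own previous prediction.
    """
    outputs = []
    prev = "<s>"
    for gold in target:
        pred = NEXT_TOKEN.get(prev, "<unk>")
        outputs.append(pred)
        prev = gold if teacher_forcing else pred
    return outputs


def count_errors(outputs, target):
    return sum(o != t for o, t in zip(outputs, target))


forced = decode(TARGET, teacher_forcing=True)
free_running = decode(TARGET, teacher_forcing=False)
print(forced, count_errors(forced, TARGET))
print(free_running, count_errors(free_running, TARGET))
```

In this toy setup the single mistake stays isolated under teacher forcing, while in free-running mode it derails every later step. During real training, that isolation is part of why teacher forcing tends to speed up convergence.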
Much of Ben’s research and work revolves around innovative techniques behind machine learning and machine intelligence. His accomplishments are impressive; Ben has authored over 120 patents and academic publications, and his work has been featured in the Wall Street Journal, New York Times, EE Times, Scientific American, Wired, TechCrunch, and other media.
In this talk, Ben explains the concept of compositional machine learning, which is the nesting of two or more functions to form a single new function. For example, a model looking at cats and dogs could learn to recognize specific parts, such as paws, tails, and whiskers. With this approach, you will be able to develop much more sophisticated models that can understand more complicated concepts in your data.
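The nesting idea itself can be shown in a few lines of plain Python. This is only a sketch of function composition, not of Ben's method: the `compose` helper and the two toy stages, `detect_parts` and `classify_animal`, are hypothetical names invented for illustration, and the "retractable claws" rule is a made-up stand-in for a learned classifier.

```python
from functools import reduce
from typing import Callable, List, Set


def compose(*funcs: Callable) -> Callable:
    """Nest functions right to left: compose(f, g)(x) == f(g(x))."""
    def composed(x):
        return reduce(lambda acc, f: f(acc), reversed(funcs), x)
    return composed


# Hypothetical stage 1: keep only the observations that are known parts.
def detect_parts(observations: List[str]) -> Set[str]:
    known_parts = {"paws", "tail", "whiskers", "retractable claws"}
    return {o for o in observations if o in known_parts}


# Hypothetical stage 2: map the recognized parts to a label.
def classify_animal(parts: Set[str]) -> str:
    return "cat" if "retractable claws" in parts else "dog"


# The composed function is a single new function built from both stages.
recognize = compose(classify_animal, detect_parts)

label = recognize(["paws", "tail", "retractable claws", "collar"])
print(label)
```

Each stage stays small and testable on its own, and the composed `recognize` function can itself be nested inside a larger pipeline.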
Spark is the next evolutionary change in data science environments, as it provides batch as well as streaming capabilities, making it a preferred platform for speedy data analysis at scale.
This workshop examines some of the newest features in Spark that allow elegant, high-performance integration with your favorite Python tooling. Adam will discuss distributed scheduling for popular libraries like TensorFlow, as well as fast model inference, traditionally a challenge with Spark.
In mission-critical real-time applications, using machine learning to analyze streaming data is gaining momentum. In those applications, Apache Kafka is the most widely used framework to process the data streams.
Here, Yong will focus on the KafkaDataset module in TensorFlow. KafkaDataset feeds Kafka streaming data directly into TensorFlow’s graph. As a part of TensorFlow (in `tf.contrib`), the implementation of KafkaDataset is mostly written in C++. The module exposes a machine-learning-friendly Python interface through TensorFlow’s `tf.data` API, so it can be fed directly to `tf.keras` and other TensorFlow modules for training and inference.
7. Accelerate AI Development with Transfer Learning, Anjali Shah, PhD & Steve Geringer, IBM
A technique called transfer learning allows data scientists to increase their productivity dramatically by sharing neural network architectures and model weights. The reuse of a pre-trained model on a different but related task enables training of deep neural networks with comparatively less data.
In this talk, you will learn the details of how transfer learning works and will see demonstrations in both financial and healthcare domains. Anjali and Steve will talk about specific use cases and lessons learned that are applicable to many other industry sectors.
Recommendation engines (also known as recommendation systems) are an integral part of many online platforms and retail companies. You’ve likely encountered countless examples, such as an online store’s “You may also be interested in…” or “Customers also purchased…” features. What’s the magic behind the scenes that makes them work so well?
To gain the skills needed to develop recommendation engines, participants will use recurrent neural networks to analyze sequential data and improve the forecast performance of time series data, and will use convolutional neural networks for image classification. Participants will also use a genetic algorithm to efficiently tune the hyperparameters of both deep learning models.
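To give a feel for how a genetic algorithm can tune hyperparameters, here is a minimal sketch in plain Python. It is not the workshop's code. The search space, the `toy_fitness` function (a stand-in for a validation score, with its optimum placed by assumption at a learning rate of 0.01 and 64 hidden units), and all helper names are hypothetical. A real run would train and evaluate a model inside the fitness function and would round `hidden_units` to an integer.

```python
import random

# Hypothetical search space: (low, high) bounds for each hyperparameter.
SEARCH_SPACE = {"learning_rate": (1e-4, 1e-1), "hidden_units": (8, 256)}


def toy_fitness(params):
    """Stand-in for a validation score; higher is better, maximum is 0."""
    lr_term = ((params["learning_rate"] - 0.01) / 0.1) ** 2
    units_term = ((params["hidden_units"] - 64) / 256) ** 2
    return -(lr_term + units_term)


def random_individual(rng):
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from one of the two parents.
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.3):
    # With probability `rate`, nudge a gene with Gaussian noise, then clip.
    child = dict(individual)
    for name, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            child[name] += rng.gauss(0, 0.1 * (hi - lo))
            child[name] = min(hi, max(lo, child[name]))
    return child


def evolve(generations=30, population_size=20, elite=4, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: the best `elite` individuals survive unchanged.
        population.sort(key=toy_fitness, reverse=True)
        parents = population[:elite]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - elite)
        ]
        population = parents + children
    return max(population, key=toy_fitness)


best = evolve()
print(best, toy_fitness(best))
```

Because the elite individuals are carried over unchanged, the best score in the population never gets worse from one generation to the next.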
9. Programming with Data: Python and Pandas, Daniel Gerlanc, President, Enplus Advisors Inc.
Open-source libraries like Pandas are essential tools for making data science more accessible to the greater data science community. Python is one of the more popular languages among data scientists thanks to the availability of tools like Pandas and scikit-learn. Together, the two make a perfect pair.
In this workshop, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for interactive data analysis. Pandas is a massive library, so Daniel will focus on its core functionality, specifically, loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals of Pandas, be aware of common pitfalls, and be ready to perform your own analyses.
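As a rough illustration of those four core operations, here is a minimal sketch. It is not taken from the workshop materials; the sales data and column names are made up, and the CSV is inlined so the example is self-contained.

```python
import io

import pandas as pd

# Made-up sample data, inlined instead of read from a file.
csv_text = """region,product,units,unit_price
East,widget,10,2.50
East,gadget,3,10.00
West,widget,7,2.50
West,gadget,0,10.00
"""

# Loading: read CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Transforming: derive a new column from existing ones.
df["revenue"] = df["units"] * df["unit_price"]

# Filtering: keep only rows with at least one unit sold.
sold = df[df["units"] > 0]

# Grouping: total revenue per region.
revenue_by_region = sold.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

With a real dataset, `pd.read_csv` would typically be pointed at a file path or URL instead of an in-memory string.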
Anyone involved in marketing knows how much work goes into it. Many marketing pros would much rather handle the strategic, decision-making side of things than the tedious, repetitive emailing and social posting. How great would it be if you had an AI that could do the grunt work for you?
In this talk, Ilya will discuss automatic decision-making and AI techniques for promotional campaigns. First, they will present a methodology to develop highly automated promotion management systems. Next, Ilya will walk through practical examples of how advanced customer and content signals can be generated using predictive models and then be used in the automation of targeting, budgeting, and pricing decisions.
Want to see more? Check out our full list of topics here and attend in person this April 30 to May 3, or watch the live stream from anywhere in the world.