
How-To Tutorials - Artificial Intelligence

86 Articles

Stack Overflow Developer Survey 2018: A Quick Overview

Amey Varangaonkar
14 Mar 2018
4 min read
Stack Overflow recently published its annual developer survey, in which over 100,000 developers and professionals participated. The survey sheds light on some very interesting insights - from developers' preferred programming languages to the development platforms they hate the most. As the survey is quite detailed and comprehensive, we thought we would present the most important takeaways and findings for you to go through quickly. If you are short of time and want to scan the results of the survey quickly, read on.

Developer Profile

- Young developers form the majority: Half the developer population falls in the 25-34 age group, while almost all respondents (90%) fall within the 18-44 age group.
- Limited professional coding experience: The majority of developers have been coding for the last 2 to 10 years. That said, almost half of the respondents have less than 5 years of professional coding experience.
- Continuous learning is key to surviving as a developer: Almost 75% of the developers have a bachelor's degree or higher. In addition, almost 90% of the respondents say they have learned a new language, framework or tool without taking a formal course, relying instead on the official documentation and/or Stack Overflow.
- Back-end developers form the majority: Among the top developer roles, more than half the developers identify themselves as back-end developers, while the percentage of data scientists and analysts is quite low. About 20% of the respondents identify themselves as mobile developers.
- Working full-time: More than 75% of the developers responded that they work a full-time job. Close to 10% are freelancers or self-employed.

Popularly used languages and frameworks

- The JavaScript family continues its reign: For the sixth year running, JavaScript is the most popular programming language and the language of choice for more than 70% of the respondents. In terms of frameworks, Node.js and Angular continue to be the most popular choices.
- Desktop development ain't dead yet: When it comes to platforms, developers prefer Linux and Windows Desktop or Server for their development work. Cloud platforms have not gained that much adoption as yet, but there is a slow and steady rise.

What about Data Science?

- Machine learning and DevOps rule the roost: Machine learning and DevOps are trending highly, thanks to the vast range of applications and research on these fronts.
- TensorFlow rises, Hadoop falls: About 75% of the respondents love the TensorFlow framework and say they would love to continue using it for their machine learning/deep learning tasks. Hadoop's popularity, on the other hand, seems to be going down as other big data frameworks like Apache Spark gain more traction.
- Python - the next big programming language: Popular data science languages like R and Python are on the rise in popularity. Python, which surpassed PHP last year, has surpassed C# this year, indicating its continuing rise. Python-based frameworks like TensorFlow and PyTorch are gaining a lot of adoption.
- Learn F# for more moolah: Languages like F#, Clojure and Rust are associated with high global salaries, with median salaries above $70,000. The likes of R and Python are associated with median salaries of up to $57,000.
- PostgreSQL growing rapidly, Redis most loved database: MySQL and SQL Server are the two most widely used databases as per the survey, while the usage of PostgreSQL has surpassed that of traditionally popular databases like MongoDB and Redis. In terms of popularity, Redis is the most loved database, while developers dread (read: are looking to switch from) databases like IBM DB2 and Oracle.
- Job hunt for data scientists: Approximately 18% of the 76,000+ respondents who are actively looking for jobs are data scientists or work as academicians and researchers.
- AI more exciting than worrying: Close to 75% of the 69,000+ respondents are more excited about the future possibilities of AI than worried about the dangers it poses. Some of the major concerns include AI making important business decisions. The big surprise was that most developers find the automation of jobs the most exciting part of an AI-enabled future.

So that's it then! What do you think about the Stack Overflow Developer Survey results? Do you agree with the developers' responses? We would love to know your thoughts. In the coming days, watch out for a more fine-grained analysis of the Stack Overflow survey data.

Uber AI Labs senior research scientist, Ankit Jain on TensorFlow updates and learning machine learning by doing [Interview]

Sugandha Lahoti
19 Dec 2019
10 min read
No doubt, TensorFlow is one of the most popular machine learning libraries right now. However, newbie developers who want to experiment with TensorFlow often face difficulties learning it when relying on tutorials alone. Recently, we sat down with Ankit Jain, senior research scientist at Uber AI Labs and one of the authors of the book TensorFlow Machine Learning Projects. Ankit talked about how real-world implementations can be a good way to learn for those developing TF models, specifically the 'learn by doing' approach. Talking about TensorFlow 2.0, he considers 'eager execution by default' a major paradigm shift and is all game for interoperability between TF 2.0 and other machine learning frameworks. He also gave us an insight into the limitations of AI algorithms (generalization, AI ethics, and labeled data, to name a few). Continue reading the full interview for a detailed perspective.

On why the TensorFlow 2 upgrade is paradigm-shifting in more ways than one

TensorFlow 2 was released last month. What are some of your top features in TensorFlow 2.0? How do you think it has upgraded the machine learning ecosystem?

TF 2.0 is a major upgrade from its predecessor in many ways. It addressed many of the shortcomings of TF 1.x, and with this release, the difference between PyTorch and TF has narrowed. One of the biggest paradigm shifts in TF 2.0 is eager execution by default. This means you don't have to pre-define a static computation graph, create sessions, deal with an unintuitive interface or endure a painful experience debugging your deep learning model code. However, you lose some runtime performance when you switch to complete eager mode. For that purpose, they have introduced the tf.function decorator, which can help you translate your Python functions to TensorFlow graphs. This way you can retain both code readability and ease of debugging while getting the performance of TensorFlow graphs.

Another major update is that many confusing redundancies have been consolidated and many functions are now integrated with the Keras API. This will help standardize the communication of data and models among the various components of the TensorFlow ecosystem. TF 2.0 also comes with backward compatibility to TF 1.x, with an easy, optional way to convert your TF 1.x code into TF 2.0. TF 1.x suffered from a lack of standardization in how we load and save trained machine learning models. TF 2.0 fixed this by defining a single API, SavedModel. As SavedModel is integrated with the TensorFlow ecosystem, it becomes much easier to deploy models to other devices and applications using TensorFlow Lite or TensorFlow.js.

With the onset of TensorFlow 2, TensorFlow and Keras are integrated into one module (tf.keras). TF 2.0 now delivers Keras as the central high-level API used to build and train models. What is the future/benefits of TensorFlow + Keras?

Keras has been a very popular high-level API for faster prototyping, production and even research. As the field of AI/ML is in its nascent stages, ease of development can have a huge impact on people getting started in machine learning. Previously, a developer new to machine learning started from Keras, while an experienced researcher used only TensorFlow 1.x due to its flexibility for building custom models. With Keras integrated as a high-level API for TF 2.0, we can expect both beginners and experts to work on the same framework, which can lead to better collaboration and a better exchange of ideas in the community.
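To make the workflow Ankit describes concrete, here is a minimal sketch of a tf.keras model trained with a tf.function-compiled step and exported to the SavedModel format. It is not taken from the interview or the book; the layer sizes, data and model path are placeholders.

```python
import tensorflow as tf

# Eager execution is on by default in TF 2.0, so ops run immediately.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()

@tf.function  # traces this Python function into a TensorFlow graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x, y = tf.random.normal((8, 32)), tf.random.normal((8, 1))
print(float(train_step(x, y)))              # runs as a compiled graph
model.save("demo_model", save_format="tf")  # exports a SavedModel directory
```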
Additionally, a single, high-level, easy-to-use API reduces confusion and streamlines consistency across production and research use cases. Overall, I think it's a great step in the right direction by Google which will enable more developers to hop on the TensorFlow ecosystem.

On TensorFlow, NLP and structured learning

Recently, Transformers 2.0, a popular open-source NLP library, was released that provides TF 2.0 and PyTorch deep interoperability. What are your views on this development?

One of the areas where deep learning has made an immense impact is natural language processing (NLP). Research in NLP is moving very fast, and it is hard to keep up with all the papers and code releases by various research groups around the world. Hugging Face, the company behind the "Transformers" library, has really eased the usage of state-of-the-art (SOTA) models and the process of building new models by simplifying the preprocessing and model-building pipeline through an easy-to-use, Keras-like interface. Transformers 2.0 is the recent release from the company, and its most important feature is interoperability between PyTorch and TF 2.0. TF 2.0 is more production-ready while PyTorch is more oriented towards research. With this upgrade, you can pretty much move from one framework to another for training, validation, and deployment of the model.

Interoperability between frameworks is very important for the AI community as it improves development velocity. Moreover, as none of the frameworks can be perfect at everything, it lets framework developers focus more on their strengths and make those features seamless. This will create greater efficiency going forward. Overall, I think this is a great development, and I expect other libraries in domains like computer vision, graph learning and so on to follow suit. This will enable many more applications of state-of-the-art models in production.

Google recently launched Neural Structured Learning (NSL), an open-source TensorFlow-based framework for training neural networks with graphs and structured data. What are some of the potential applications of NSL? What do you think can be some machine learning projects based around NSL?

Neural structured learning is a concept of learning neural network parameters with structured signals other than features. Many real-world datasets contain structured information like knowledge graphs or molecular graphs in biology. Incorporating these signals can lead to a more accurate and robust model. From an implementation perspective, it boils down to adding a regularizer to the loss function such that the representations of neighboring nodes in the graph are similar.

Any application where the amount of labeled data is limited but structural information, such as a knowledge graph, can be exploited is a good candidate for these types of models. A possible example could be fraud detection in online systems. Fraud data generally has sparse labels, and fraudsters create multiple accounts that are connected to each other through some information like devices. This structured information can be utilized to learn a better representation of fraud accounts. There can be other applications in molecular data and other problems involving knowledge graphs.

On Ankit's experience working on his book, TensorFlow Machine Learning Projects

Tell us the motivation behind writing your book TensorFlow Machine Learning Projects. Why is TensorFlow ideal for building ML projects?
What are some of your favorite machine learning projects from this book?

When I started learning TensorFlow, I stumbled upon many tutorials (including the official ones) which explained various concepts of how TensorFlow works. While that was helpful in understanding the basics, most of my learning came from building projects with TensorFlow. That is when I realized the need for a resource that teaches using a 'learn by doing' approach. This book is unique in the way it teaches machine learning theory, TensorFlow utilities and programming concepts, all while developing a project that is fun to build and of practical use.

My favorite chapter from the book is "Generating Uncertainty in Traffic Signs Classifier using Bayesian Neural Networks". With the development of self-driving cars, traffic sign detection is a major problem that needs to be solved. This chapter explains the advanced AI concept of Bayesian neural networks and shows, step by step, how to use them to detect traffic signs using TensorFlow. Some readers of the book have already started to use this concept in their practical applications.

Machine learning challenges and advice for those developing TensorFlow models

What are the biggest challenges today in the field of machine learning and AI? What do you see as the greatest technology disruptors in the next 5 years?

While AI and machine learning have seen huge success in recent years, there are a few limitations of AI algorithms as we see them today. Some of the major ones are:

- Labeled data: Most of the success of AI has come from supervised learning. Many of the recent supervised deep learning algorithms require huge quantities of labeled data, which is expensive to obtain. For example, obtaining huge amounts of clinical trial data for healthcare prediction is very challenging. The good news is that there is some research around building good ML models using sparse data labels.
- Explainability: Deep learning models are essentially a "black box" where you don't know what factor(s) led to a prediction. For some applications, like money lending, disease diagnosis and fraud detection, explanations of predictions become very important. Currently, we see some nascent work in this direction with the LIME and SHAP libraries.
- Generalization: In the current state of AI, we build one model for each application. We still don't have good generalization of models from one task to another. Generalization, if solved, could lead us to truly artificial general intelligence (AGI). Thankfully, approaches like transfer learning and meta-learning are trying to solve this challenge.
- Bias, fairness, and ethics: The output of a machine learning model is heavily based on the input training data. Many a time, training data can have biases towards particular ethnicities, classes, religions, etc. We need more solutions in this direction to build trust in AI algorithms.

Overall, I feel AI is becoming mainstream, and in the next 5 years we will see many traditional industries adopt AI to solve critical business problems and achieve more automation. At the same time, tooling for AI will keep on improving, which will also help its adoption.

What is your advice for those developing machine learning projects on TensorFlow?

Building projects with new techniques and technologies is a hard process. It requires patience, dealing with failures and hard work. For that reason, it is very important to pick a project that you are passionate about.
This way, you will continue building even if you are stuck somewhere. The selection of the right project is by far the most important criterion in the project-based learning method.

About the Author

Ankit currently works as a Senior Research Scientist at Uber AI Labs, the machine learning research arm of Uber. His work primarily involves the application of deep learning methods to a variety of Uber's problems, ranging from food recommendation systems and forecasting to self-driving cars. Previously, he has worked in a variety of data science roles at Bank of America, Facebook and other startups. Additionally, he has been a featured speaker at many of the top AI conferences and universities across the US, including UC Berkeley and the O'Reilly AI Conference. He completed his MS at UC Berkeley and his BS at IIT Bombay (India). You can find him on LinkedIn, Twitter, and GitHub.

About the Book

With the help of this book, TensorFlow Machine Learning Projects, you'll not only learn how to build advanced projects using different datasets but also be able to tackle common challenges using a range of libraries from the TensorFlow ecosystem. To start with, you'll get to grips with using TensorFlow for machine learning projects; you'll explore a wide range of projects using TensorForest and TensorBoard for detecting exoplanets, TensorFlow.js for sentiment analysis, and TensorFlow Lite for digit classification. As you make your way through the book, you'll build projects in various real-world domains. By the end of this book, you'll have gained the required expertise to build full-fledged machine learning projects at work.
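Returning to the neural structured learning idea Ankit described earlier in the interview - "adding a regularizer to the loss function such that the representations of neighboring nodes in the graph are similar" - here is a toy sketch of that concept in plain TensorFlow. It only illustrates the idea; it is not the Neural Structured Learning library's API, and the weighting factor alpha is an arbitrary assumption.

```python
import tensorflow as tf

def graph_regularized_loss(labels, logits, embeddings, edges, alpha=0.1):
    """Supervised loss plus a penalty pulling neighbouring nodes' embeddings together.

    labels:     [num_nodes] integer class labels
    logits:     [num_nodes, num_classes] model outputs
    embeddings: [num_nodes, dim] hidden representations produced by the model
    edges:      [num_edges, 2] (source, target) node index pairs from the graph
    """
    supervised = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True))
    src = tf.gather(embeddings, edges[:, 0])
    dst = tf.gather(embeddings, edges[:, 1])
    neighbour_penalty = tf.reduce_mean(tf.reduce_sum(tf.square(src - dst), axis=-1))
    return supervised + alpha * neighbour_penalty
```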

Artificial intelligence, data science, and big data in 2019: what really mattered

Richard Gall
16 Dec 2019
6 min read
The techlash hasn't died down - it's just become normalized. Barely a day passes without a new scandal emerging, from questionable surveillance to racist AI algorithms. But it hasn't all been bad: while the negatives get a lot of attention (and so they should - the consequences of tech can be lethal, both societally and literally), there was still plenty to get excited about. And for those working in the data profession - as analysts, scientists, and engineers - there were several important trends that really helped to define where we are now from a purely practical perspective, as well as hinting at where we might go in the future. With just a few weeks left of the year (and the decade!), let's look at some of the key things that defined this year in the field of data science and data engineering.

The growth of PyTorch

TensorFlow is undoubtedly the most popular deep learning framework. You might even say that its role in popularizing deep learning and artificial intelligence has been understated. But while TensorFlow has held its place for some time, 2019 was the year when things started to change. Look, for example, at this Google Trends graph (and yes, I know it's not in any way scientific): as you can see, TensorFlow hit its stride pretty early on. It's only in the last 12 months or so that PyTorch has been narrowing the gap.

One of the reasons for this is the fact that PyTorch 1.0 was released at the end of last year. This has been the foundation that has spurred its growth over the last 12 months, effectively announcing its 'official' arrival on the scene, with Facebook (PyTorch's creator) building on this foundation throughout the year with a few small but important releases. PyTorch 1.3, for example, which was released at the PyTorch Developer Conference in October, included a number of 'experimental' new features, including named tensors and PyTorch Mobile.

Another reason for PyTorch's growth this year is that it is finding traction in the research field. This article provides some hard data showing that PyTorch is starting to grow in this area, citing the tool's comparable simplicity, API and performance as the reasons it is undermining TensorFlow's utter dominance of the field.

Find our PyTorch bundle, and other data bundles, here. Grab 5 titles for just $25.

TensorFlow 2.0

While PyTorch has grown significantly in 2019, TensorFlow is nevertheless still holding its place at the top of the deep learning rankings. And TensorFlow 2.0 has undoubtedly cemented its position. With the alpha release getting developers excited since March, the full launch of 2.0 marked an important milestone for the project. The key difference between TensorFlow 2.0 and 1.0 is ultimately accessibility and ease of use. Despite its massive popularity, TensorFlow 1.0 always had a reputation for being a little more difficult to use than many other deep learning tools. The team were clearly aware of this and have done a lot to make life easier for TensorFlow developers. "With tight integration of Keras into TensorFlow, eager execution by default, and Pythonic function execution," the team write in the release notes, "TensorFlow 2.0 makes the experience of developing applications as familiar as possible for Python developers." When placed alongside the exciting development of PyTorch, it's clear that these two tools are going to be defining deep learning in the year - or years - to come.

Get up to date with what's new in TensorFlow 2.0 with TensorFlow 2.0 Quick Start Guide.
Stream processing with Kafka, Flink, and others

Dealing with large quantities of data in real time is now the cutting edge of big data, and it's for this reason that this year we've started to see stream processing gain headway in the mainstream. Although it has long been an important technique for organizations with data-intensive needs, the use of cloud and hybrid solutions - as well as an overall awareness of the opportunities of real-time data - has helped it become truly mainstream. In turn, this is giving new prominence to a range of stream-processing platforms. Kafka, Spark, and Flink are just three of the most well-known names in this space, but the market is undoubtedly growing.

Another key driver here is Nvidia - as one of the leading hardware companies, it deserves a lot of credit for helping to make massive processing power accessible to organizations that wouldn't have had a chance just a few years ago. With CUDA, Nvidia's parallel programming paradigm for GPUs, the company is helping all sorts of users to leverage stream processing in different ways.

Get started with Apache Kafka with Apache Kafka Quick Start Guide.

Data analysis on the cloud

Although I've already mentioned how influential TensorFlow was in popularizing deep learning, today the public cloud is going even further. It's making artificial intelligence and analytics accessible to new roles (think of tools like Azure Machine Learning Studio and Amazon SageMaker), as well as making it easier to build and deploy machine learning models in applications and products. In recent weeks, Microsoft has made another step in its bid to eat into AWS's market share with Azure Synapse. Essentially a next-generation Azure SQL Data Warehouse, Synapse is designed to bridge the gap between data lake and data warehouse - offering massive scale and improving analytical speed. It will be interesting to see how this plays out in the wider market. AWS might respond with something similar - but the onus remains on Microsoft to shift mindshare; AWS will want to consolidate its powerful position.

Security

It would be wrong to suggest that security is a new issue in the world of data engineering and analytics. But in 2019 it became almost impossible to think about the two domains as separate from one another. This cuts two different ways: on the one hand, the emphasis on securing data and protecting privacy has never been greater. On the other hand, artificial intelligence and machine learning have started to play a critical part in the way we monitor and identify threats to our systems. To a certain extent this expresses the double bind that data poses: the amount of data at our disposal is a nightmare from a governance and architectural perspective, but it is, at the same time, a way of mitigating that very nightmare. All in all, then, a bit of a vicious cycle, but nevertheless a reminder that however big our data gets, and however much we try to automate, there will always be a need for humans to think creatively and strategically about how we actually go about solving problems.

Explore Packt's security bundles now. For more technology eBooks and videos to prepare you for 2020, head to the Packt store.
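For readers who haven't touched the stream processing mentioned above, the smallest useful Kafka example in Python looks roughly like this. It is a sketch using the third-party kafka-python client, and it assumes a broker running on localhost:9092 and a topic named 'events'.

```python
from kafka import KafkaProducer, KafkaConsumer  # pip install kafka-python

# Produce a few messages to an 'events' topic (broker address is an assumption).
producer = KafkaProducer(bootstrap_servers="localhost:9092")
for i in range(3):
    producer.send("events", f"event-{i}".encode("utf-8"))
producer.flush()

# Consume them back from the earliest offset; give up after 5 seconds of silence.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,
)
for message in consumer:
    print(message.offset, message.value.decode("utf-8"))
```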

Deep learning models have massive carbon footprints, can photonic chips help reduce power consumption?

Sugandha Lahoti
11 Jun 2019
10 min read
Most of the recent breakthroughs in artificial intelligence are driven by data and computation. What is essentially missing is the energy cost. Most large AI networks require huge amounts of training data to ensure accuracy. However, these accuracy improvements depend on the availability of exceptionally large computational resources. The larger the computation resource, the more energy it consumes. This is not only costly financially (due to the cost of hardware, cloud compute, and electricity) but also straining for the environment, due to the carbon footprint required to fuel modern tensor processing hardware. Considering the climate change repercussions we are facing on a daily basis, consensus is building on the need for AI research ethics to include a focus on minimizing and offsetting the carbon footprint of research. Researchers should also put the energy cost in the results of research papers alongside time, accuracy, etc.

The outsized environmental impact of deep learning was further highlighted in a recent research paper published by MIT researchers. In the paper titled "Energy and Policy Considerations for Deep Learning in NLP", researchers performed a life cycle assessment for training several common large AI models. They quantified the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP, and provided recommendations to reduce costs and improve equity in NLP research and practice.

Per the paper, training AI models can emit more than 626,000 pounds of carbon dioxide equivalent - nearly five times the lifetime emissions of the average American car (and that includes the manufacture of the car itself). It is estimated that we must cut carbon emissions by half over the next decade to deter escalating rates of natural disaster. This speaks volumes about the carbon cost of deep learning and raises the question of whether the heavy (carbon) investment is really worth the marginal improvement in predictive accuracy over cheaper, alternative methods. This news alarmed people tremendously.

https://twitter.com/sakthigeek/status/1137555650718908416
https://twitter.com/vinodkpg/status/1129605865760149504
https://twitter.com/Kobotic/status/1137681505541484545

Even if some of this energy comes from renewable or carbon credit-offset resources, the high energy demands of these models are still a concern. This is because the energy used is not derived from carbon-neutral sources in many locations, and even when renewable energy is available, it is limited by the equipment needed to store it.

The carbon footprint of NLP models

The researchers in this paper focus specifically on NLP models. They looked at four models - the Transformer, ELMo, BERT, and GPT-2 - and trained each on a single GPU for up to a day to measure its power draw. Next, they used the number of training hours listed in the models' original papers to calculate the total energy consumed over the complete training process. This number was then converted into pounds of carbon dioxide equivalent based on the average energy mix in the US, which closely matches the energy mix used by Amazon's AWS, the largest cloud services provider. The researchers found that the environmental costs of training grew proportionally to model size.
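The conversion the researchers describe - measured power draw, multiplied by reported training hours, scaled by data-centre overhead and an average grid emissions factor - is simple arithmetic. A rough sketch with purely illustrative numbers (these are not the paper's measurements):

```python
# Back-of-the-envelope CO2 estimate for one training run (all numbers illustrative).
avg_power_draw_kw = 1.5      # measured draw of GPUs + CPU + DRAM, in kW (assumed)
training_hours = 24 * 7      # total training time reported for the model (assumed)
pue = 1.58                   # data-centre power usage effectiveness overhead (assumed)
lbs_co2_per_kwh = 0.954      # average US grid emissions factor (assumed)

energy_kwh = avg_power_draw_kw * training_hours * pue
co2_lbs = energy_kwh * lbs_co2_per_kwh
print(f"{energy_kwh:.0f} kWh -> {co2_lbs:.0f} lbs CO2e")
```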
These costs increased exponentially when additional tuning steps were used to increase the model's final accuracy. In particular, neural architecture search had high associated costs for little performance benefit. Neural architecture search is a tuning process which tries to optimize a model by incrementally tweaking a neural network's design through exhaustive trial and error. The researchers also noted that these figures should only be considered as a baseline. In practice, AI researchers mostly develop a new model from scratch or adapt an existing model to a new data set; both require many more rounds of training and tuning.

Based on their findings, the authors recommend certain proposals to heighten awareness of this issue in the NLP community and promote mindful practice and policy:

- Researchers should report training time and sensitivity to hyperparameters. There should be a standard, hardware-independent measurement of training time, such as gigaflops required to convergence. There should also be a standard measurement of model sensitivity to data and hyperparameters, such as variance with respect to the hyperparameters searched.
- Academic researchers should get equitable access to computation resources. The trend toward training huge models on tons of data is not feasible for academics, because they don't have the computational resources. It would be more cost effective for academic researchers to pool resources to build shared compute centers at the level of funding agencies, such as the U.S. National Science Foundation.
- Researchers should prioritize computationally efficient hardware and algorithms. For instance, developers could help reduce the energy associated with model tuning by providing easy-to-use APIs implementing more efficient alternatives to brute-force search.

The next step is to introduce energy cost as a standard metric that researchers are expected to report alongside their findings. They should also try to minimize the carbon footprint by developing compute-efficient training methods, such as new ML algorithms or new engineering tools that make existing ones more compute efficient. Above all, we need to formulate strict public policies that steer digital technologies toward speeding a clean energy transition while mitigating the risks.

Another factor that contributes to high energy consumption is the electronic hardware on which most deep learning tasks run. To tackle that issue, researchers and major tech companies - including Google, IBM, and Tesla - have developed "AI accelerators", specialized chips that improve the speed and efficiency of training and testing neural networks. However, these AI accelerators use electricity and have a theoretical minimum limit for energy consumption. Also, most present-day ASICs are based on CMOS technology and suffer from the interconnect problem: even in highly optimized architectures where data are stored in register files close to the logic units, a majority of the energy consumption comes from data movement, not logic. Analog crossbar arrays based on CMOS gates or memristors promise better performance, but as analog electronic devices, they suffer from calibration issues and limited accuracy.

Implementing chips that use light instead of electricity

Another group of MIT researchers has developed a "photonic" chip that uses light instead of electricity, and consumes relatively little power in the process.
The photonic accelerator uses more compact optical components and optical signal-processing techniques to drastically reduce both power consumption and chip area. Practical applications for such chips can also include reducing energy consumption in data centers. "In response to vast increases in data storage and computational capacity in the last decade, the amount of energy used by data centers has doubled every four years, and is expected to triple in the next 10 years."

https://twitter.com/profwernimont/status/1137402420823306240

The chip could be used to process massive neural networks millions of times more efficiently than today's classical computers.

How the photonic chip works

The researchers have given a detailed explanation of the chip's working in their research paper, "Large-Scale Optical Neural Networks Based on Photoelectric Multiplication". The chip relies on a compact, energy-efficient "optoelectronic" scheme that encodes data with optical signals, but uses "balanced homodyne detection" for matrix multiplication. This technique produces a measurable electrical signal after calculating the product of the amplitudes (wave heights) of two optical signals. Pulses of light encoded with information about the input and output neurons for each neural network layer - which are needed to train the network - flow through a single channel. Optical signals carrying the neuron and weight data fan out to a grid of homodyne photodetectors. The photodetectors use the amplitude of the signals to compute an output value for each neuron. Each detector feeds an electrical output signal for each neuron into a modulator, which converts the signal back into a light pulse. That optical signal becomes the input for the next layer, and so on.

Limitations of photonic accelerators

Photonic accelerators generally have unavoidable noise in the signal. The more light that is fed into the chip, the less noise and the greater the accuracy; less input light increases efficiency but negatively impacts the neural network's performance. The efficiency of AI accelerators is measured by how many joules it takes to perform a single operation of multiplying two numbers. Traditional accelerators are measured in picojoules, or one-trillionth of a joule. Photonic accelerators measure in attojoules, which is a million times more efficient. In their simulations, the researchers found their photonic accelerator could operate with sub-attojoule efficiency.

Tech companies are the largest contributors to the carbon footprint

The realization that training an AI model can produce emissions equivalent to the lifetime emissions of five cars should make the carbon footprint of artificial intelligence an important consideration for researchers and companies going forward. UMass Amherst's Emma Strubell, one of the research team and co-author of the paper, said, "I'm not against energy use in the name of advancing science, obviously, but I think we could do better in terms of considering the trade off between required energy and resulting model improvement." "I think large tech companies that use AI throughout their products are likely the largest contributors to this type of energy use," Strubell said. "I do think that they are increasingly aware of these issues, and there are also financial incentives for them to curb energy use." In 2016, Google's DeepMind was able to reduce the energy required to cool Google's data centers by 30%. This full-fledged AI system has features including continuous monitoring and human override.
Recently, Microsoft doubled its internal carbon fee to $15 per metric ton on all carbon emissions. The funds from this higher fee will maintain Microsoft's carbon neutrality and help meet its sustainability goals. On the other hand, Microsoft is also two years into a seven-year deal - rumored to be worth over a billion dollars - to help Chevron, one of the world's largest oil companies, better extract and distribute oil.

https://twitter.com/AkwyZ/status/1137020554567987200

Amazon had announced that it would power its data centers with 100 percent renewable energy, without a dedicated timeline. Since 2018, Amazon has reportedly slowed down its efforts, using only 50 percent renewable energy. It has also not announced any new deals to supply clean energy to its data centers since 2016, according to a report by Greenpeace, and it quietly abandoned plans for one of its last scheduled wind farms last year. In April, over 4,520 Amazon employees organized against Amazon's continued profiting from climate devastation. However, Amazon rejected all 11 shareholder proposals, including the employee-led climate resolution, at its annual shareholder meeting.

Both of these studies illustrate the dire need to change our outlook on building artificial intelligence models and chips that have an impact on the carbon footprint. However, this does not mean halting AI research altogether. Instead, there should be an awareness of the environmental impact that training AI models might have, which in turn can inspire researchers to develop more efficient hardware and algorithms for the future.

Responsible tech leadership or climate washing? Microsoft hikes its carbon tax and announces new initiatives to tackle climate change.
Microsoft researchers introduce a new climate forecasting model and a public dataset to train these models.
Now there's a CycleGAN to visualize the effects of climate change. But is this enough to mobilize action?

Brad Miro talks TensorFlow 2.0 features and how Google is using it internally

Sugandha Lahoti
10 Dec 2019
6 min read
TensorFlow 2.0, released in October, has got developers excited about a myriad of features and its ease of use. At the EuroPython Conference 2019, Brad Miro, developer programs engineer at Google, talked about the updates being made to TensorFlow 2.0. He also gave an overview of how Google is using TensorFlow, moving on to why Python is important for TensorFlow development and how to migrate from TF 1.x to TF 2.0. EuroPython is one of the most popular Python programming language community conferences. Below are some highlights from Brad's talk at EuroPython.

What is TensorFlow?

TensorFlow is an open-source deep learning library developed at Google and first released in 2015. It's a Python framework that includes a number of utilities for helping you write deep neural networks, supporting both GPUs and TPUs. A lot of deep learning involves mathematics, statistics, and algebra, and performing low-level optimizations on your system. TensorFlow takes care of much of that, leaving you to focus on actually writing your model.

How TensorFlow is used internally at Google

TensorFlow is used internally at Google to power all of its machine learning and AI. Google's data centers are powered using AI and TensorFlow to help optimize their usage - to reduce bandwidth, to ensure network connections are optimized, and to reduce power consumption. TensorFlow is also useful for performing global localization in Google Maps. It is also used heavily in the Google Pixel range of smartphones to help optimize the software. These technologies are also used in medical research, specifically in the field of computer vision. For example, TensorFlow is used to distinguish the retinal image of a healthy eye from the retinal image of an eye that has diabetic retinopathy.

Further Learning

If you want to learn to build more computer vision applications with TensorFlow 2.0, check out the book Hands-On Computer Vision with TensorFlow 2 by Benjamin Planche and Eliot Andres. This book by Packt Publishing is a practical guide to building high-performance systems for object detection, segmentation, video processing, smartphone applications, and more. By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.

Furthermore, Google is using AI and TensorFlow to predict whether or not objects in space are planets. To summarize, they use AI to predict whether or not fluctuations in the brightness of an object are due to it being a planet.

Why Python is so important for TensorFlow

Python has always been the choice for TensorFlow due to the language being extremely easy to use and having a rich ecosystem for data science, including tools such as NumPy, scikit-learn, and pandas. When TensorFlow was being built, the idea was that it should have the simplicity of NumPy and the performance of C, but the ease of use of Python.

What does TensorFlow 2.0 bring to the table?

TensorFlow 2.0 is powerful, flexible, scalable and easily deployable.

What's gone:
- Session.run
- tf.control_dependencies
- tf.global_variables_initializer
- tf.cond, tf.while_loop
- tf.contrib

What's new:
- Eager execution enabled by default
- tf.function
- Keras as the main high-level API
- Distribution Strategy API
- SavedModel API

TensorFlow 2.0 had a major API cleanup: many API symbols were removed or renamed for better consistency and clarity. Session.run has been replaced with eager execution, which effectively means that your TensorFlow code runs like NumPy code.
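A minimal illustration of that last point - in TF 2.0 an operation returns a concrete value immediately, with no Session involved (toy values):

```python
import tensorflow as tf

# TF 1.x style (for contrast): build a graph, then evaluate it inside a Session.
#   a = tf.placeholder(tf.float32); b = a * 2
#   with tf.Session() as sess: print(sess.run(b, feed_dict={a: 3.0}))

# TF 2.0 style: eager execution by default, so this behaves like NumPy.
a = tf.constant([1.0, 2.0, 3.0])
b = a * 2 + 1
print(b.numpy())  # [3. 5. 7.] -- a concrete value, no Session needed
```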
Eager execution enables fast iteration and intuitive debugging without building a graph. It also makes creating and experimenting with models using TensorFlow easier, and can be especially useful when using the tf.keras model subclassing API. TensorFlow 2.0 has tf.function, a Python decorator that lets you run regular Python code which is later compiled down to TensorFlow code using AutoGraph. The Distribution Strategy API in TensorFlow 2.0 allows machine learning researchers to distribute training across a wide variety of compute configurations. This release also allows distributed training with Keras' model.fit and custom training loops.

Keras is introduced as the main high-level API. Keras is a popular high-level API used for easy and fast prototyping, building, and training of deep learning models. This will enable developers to easily leverage their various model-building APIs. There are two main methods of using Keras with TensorFlow (a minimal sketch of both follows below):

Symbolic (Keras Sequential)
- Your model is a graph of layers
- Any graph you compile will run
- TensorFlow helps you debug by catching errors at compile time

Imperative (Keras subclassing)
- Your model is Python bytecode
- Complete flexibility and control
- Harder to debug, harder to maintain

There are pros and cons to each method; it really just depends on what your specific use cases are.

The SavedModel API allows you to save your trained ML model in a language-neutral format. With TensorFlow 2.0, all TensorFlow ecosystem projects, including TensorFlow Lite, TensorFlow.js, TensorFlow Serving, and TensorFlow Hub, support SavedModels. On TensorFlow Hub, you can store and download pre-built models. You can use TensorFlow Extended, a Python library that can be run on your servers to productionize your models. TensorFlow Lite lets you run your TensorFlow models on edge devices. With TensorFlow.js, you can run machine learning models using JavaScript in the browser, or run them on servers using Node.js. TensorFlow also has Swift for TensorFlow to help developers use Swift to develop machine learning models. "Swift for TensorFlow provides a new programming model that combines the performance of graphs with the flexibility and expressivity of Eager execution, with a strong focus on improved usability at every level of the stack. This is not just a TensorFlow API wrapper written in Swift — we added compiler and language enhancements to Swift to provide a first-class user experience for machine learning developers." Other packages in the TensorFlow ecosystem used for niche use cases are TF Probability, TF Agents (reinforcement learning), Tensor2Tensor, TF Ranking, TF Text (natural language processing), TF Federated, TF Privacy and more.

How to upgrade from TensorFlow 1.x to TensorFlow 2.0

There are several migration guides available on TensorFlow's website. You can also use the tf.compat.v1 library for backwards compatibility, and the tf_upgrade_v2 script, which you can execute on top of any Python script to convert TF 1.x code to 2.0 code. You can also read more about TF 2.0 migration in our book Hands-On Computer Vision with TensorFlow 2, which introduces the automatic migration tool and compares TensorFlow 1 concepts with their TensorFlow 2 counterparts, with a detailed guide on migrating to idiomatic TensorFlow 2 code. You can watch Brad's full talk on YouTube. This video is licensed under the CC BY-NC-SA 3.0 license.
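As promised above, here is a minimal sketch of the two Keras styles Brad contrasts (toy layer sizes, not taken from the talk):

```python
import tensorflow as tf

# Symbolic style: the model is a graph of layers, checked as it is assembled.
sequential_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])

# Imperative style (subclassing): the model is ordinary Python code,
# maximally flexible but harder for the framework to inspect up front.
class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.hidden = tf.keras.layers.Dense(64, activation="relu")
        self.out = tf.keras.layers.Dense(1)

    def call(self, inputs, training=False):
        return self.out(self.hidden(inputs))

subclassed_model = MyModel()
x = tf.zeros((1, 10))
print(sequential_model(x).shape, subclassed_model(x).shape)
```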
TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development
Introducing Spleeter, a TensorFlow based Python library that extracts voice and sound from any music track
TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!

“The challenge in Deep Learning is to sustain the current pace of innovation”, explains Ivan Vasilev, machine learning engineer

Sugandha Lahoti
13 Dec 2019
8 min read
If we talk about recent breakthroughs in the software community, machine learning and deep learning are major contenders - the usage, adoption, and experimentation of deep learning have increased exponentially. Especially in the areas of computer vision, speech, and natural language processing and understanding, deep learning has made unprecedented progress. GANs, variational autoencoders and deep reinforcement learning are also creating impressive AI results.

To learn more about the progress of deep learning, we interviewed Ivan Vasilev, a machine learning engineer and researcher based in Bulgaria. Ivan is also the author of the book Advanced Deep Learning with Python. In this book, he teaches advanced deep learning topics like attention mechanisms, meta-learning, graph neural networks, memory-augmented neural networks, and more, using the Python ecosystem. In this interview, he shares his experiences working on this book, compares TensorFlow and PyTorch, and talks about computer vision, NLP, and GANs.

On why he chose Computer Vision and NLP as two major focus areas of his book

Computer vision and natural language processing are two popular areas where a number of developments are ongoing. In his book, Advanced Deep Learning with Python, Ivan delves deep into these two broad application areas. "One of the reasons I emphasized computer vision and NLP", he clarifies, "is that these fields have a broad range of real-world commercial applications, which makes them interesting for a large number of people."

The other reason for focusing on computer vision, he says, "is because of the natural (or human-driven if you wish) progress of deep learning. One of the first modern breakthroughs was in 2012, when a solution based on a convolutional network won that year's ImageNet competition with a large margin compared to any previous algorithm. Thanks in part to this impressive result, interest in the field was renewed and brought many other advances, including solving complex tasks like object detection and new generative models like generative adversarial networks. In parallel, the NLP domain saw its own wave of innovation with things like word vector embeddings and the attention mechanism."

On the ongoing battle between TensorFlow and PyTorch

There are two popular machine learning frameworks that are currently at par - TensorFlow and PyTorch (both had new releases in the past month, TensorFlow 2.0 and PyTorch 1.3). There is an ongoing debate that pits TensorFlow and PyTorch against each other as rivaling technologies and communities. Ivan does not think there is a clear winner between the two libraries, and this is why he has included both in the book. He explains, "On the one hand, it seems that the API of PyTorch is more streamlined and the library is more popular with the academic community. On the other hand, TensorFlow seems to have better cloud support and enterprise features. In any case, developers will only benefit from the competition. For example, PyTorch has demonstrated the importance of eager execution, and TensorFlow 2.0 now has much better support for eager execution, to the point that it is enabled by default. In the past, TensorFlow had internal competing APIs, whereas now Keras is promoted as its main high-level API.
On the other hand, PyTorch 1.3 has introduced experimental support for iOS and Android devices and quantization (computation operations with reduced precision for increased efficiency)."

Using Machine Learning in the stock trading process can make markets more efficient

Ivan discusses his venture into the field of financial machine learning, being the author of an ML-oriented, event-based algorithmic trading library. However, financial machine learning (and stock price prediction in particular) is usually not the focus of mainstream deep learning research. "One reason", Ivan states, "is that the field isn't as appealing as, say, computer vision or NLP. At first glance, it might even appear gimmicky to predict stock prices." He adds, "Another reason is that quality training data isn't freely available and can be quite expensive to obtain. Even if you have such data, pre-processing it in an ML-friendly way is not a straightforward process, because the noise-to-signal ratio is a lot higher compared to images or text. Additionally, the data itself could have huge volume."

"However", he counters, "using ML in finance could have benefits, besides the obvious (getting rich by trading stocks). The participation of ML algorithms in the stock trading process can make the markets more efficient. This efficiency will make it harder for market imbalances to stay unnoticed for long periods of time. Such imbalances will be corrected early, thus preventing painful market corrections, which could otherwise lead to economic recessions."

GANs can be used for nefarious purposes, but that doesn't warrant discarding them

Ivan has also given special emphasis to generative adversarial networks in his book. Although extremely useful, in recent times GANs have been used to generate high-dimensional fake data that looks very convincing. Many researchers and developers have raised concerns about the negative repercussions of using GANs and wondered if it is even possible to prevent and counter their misuse or abuse. Ivan acknowledges that GANs may have unintended outcomes, but that shouldn't be the sole reason to discard them. He says, "Besides great entertainment value, GANs have some very useful applications and could help us better understand the inner workings of neural networks. But as you mentioned, they can be used for nefarious purposes as well. Still, we shouldn't discard GANs (or any algorithm with a similar purpose) because of this - if only because the bad actors won't discard them. I think the solution to this problem lies beyond the realm of deep learning. We should strive to educate the public on the possible adverse effects of these algorithms, but also on their benefits. In this way we can raise awareness of machine learning and spark an honest debate about its role in our society."

Machine learning can have both intentional and unintentional harmful effects

Awareness and ethics go in parallel. Ethics is one of the most important topics to emerge in machine learning and artificial intelligence over the last year. Ivan agrees that ethics and algorithmic bias in machine learning are of extreme importance. He says, "We can view the potential harmful effects of machine learning as either intentional or unintentional. For example, the bad actors I mentioned when we discussed GANs fall into the intentional category. We can limit their influence by striving to keep the cutting edge of ML research publicly available, thus denying them any unfair advantage of potentially better algorithms.
Fortunately, this is largely the case now and hopefully will remain that way in the future."

"I don't think algorithmic bias is necessarily intentional," he says. "Instead, I believe that it is the result of the underlying injustices in our society, which creep into ML through either skewed training datasets or the unconscious bias of researchers. Although the bias might not be intentional, we still have a responsibility to put in a conscious effort to eliminate it."

Challenges in the Machine learning ecosystem

"The field of ML exploded (in a good sense) a few years ago," says Ivan, "thanks to a combination of algorithmic and computer hardware advances. Since then, researchers have introduced new, smarter and more elegant deep learning algorithms. But history has shown that AI can generate such great hype that even the impressive achievements of the last few years could fall short of the expectations of the general public."

"So, in a broader sense, the challenge in front of ML is to sustain the current pace of innovation. In particular, current deep learning algorithms fall short in some key intelligence areas where humans excel. For example, neural networks have a hard time learning multiple unrelated tasks. They also tend to perform better when working with unstructured data (like images), compared to structured data (like graphs)."

"Another issue is that neural networks sometimes struggle to remember long-distance dependencies in sequential data. Solving these problems might require new fundamental breakthroughs, and it's hard to give an estimate for such one-time events. But even at the current level, ML can fundamentally change our society (hopefully for the better). For instance, in the next 5 to 10 years, we could see the widespread introduction of fully autonomous vehicles, which have the potential to transform our lives."

This is just a snapshot of some of the important focus areas in the deep learning ecosystem. You can check out more of Ivan's work in his book Advanced Deep Learning with Python. In this book you will investigate and train CNN models with GPU-accelerated libraries like TensorFlow and PyTorch. You will also apply deep neural networks to state-of-the-art domains like computer vision problems, NLP, GANs, and more.

Author Bio

Ivan Vasilev started working on the first open-source Java deep learning library with GPU support in 2013. The library was acquired by a German company, where he continued its development. He has also worked as a machine learning engineer and researcher in the area of medical image classification and segmentation with deep neural networks. Since 2017 he has focused on financial machine learning. He is working on a Python-based platform which provides the infrastructure to rapidly experiment with different ML algorithms for algorithmic trading. You can find him on LinkedIn and GitHub.

Kaggle's Rachel Tatman on what to do when applying deep learning is overkill
Brad Miro talks TensorFlow 2.0 features and how Google is using it internally
François Chollet, creator of Keras, on TensorFlow 2.0 and Keras integration, tricky design decisions in deep learning and more
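As a footnote to the PyTorch 1.3 quantization point Ivan mentions earlier in the interview, dynamic quantization of a model's linear layers is essentially a one-liner in PyTorch. The model below is a made-up stand-in, not anything discussed in the interview.

```python
import torch
import torch.nn as nn

# A made-up float32 model standing in for whatever you have trained.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Swap the Linear layers for int8 dynamically-quantized equivalents.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, reduced-precision weights
```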

How Microsoft, Airbnb, Genentech, and Toyota are using PyTorch to build and deploy production-ready AI

Sugandha Lahoti
10 Dec 2019
6 min read
Built by Facebook engineers and researchers, PyTorch is an open-source, Python-based deep learning framework for developing new machine learning models, exploring neural network architectures and deploying them at scale in production. PyTorch is known for its advanced indexing and functions, imperative style, integration support, and API simplicity. This is one of the key reasons why developers prefer this framework for research and hackability. PyTorch is also the second-fastest-growing open-source project on GitHub, with a community that includes everyone from developers just getting acquainted with AI to some of the best-known AI researchers and companies doing AI.

At its F8 annual developer conference, Facebook shared how production-ready PyTorch 1.0 is being adopted by the community and the industry. If you want to learn how you can use this framework to build projects in machine intelligence and deep learning, you may go through our book PyTorch Deep Learning Hands-On by Sherin Thomas and Sudhanshu Passi. This book works through numerous examples and dynamic AI applications, demonstrating the simplicity and efficiency of PyTorch.

A number of companies are using PyTorch for research and for production. At the F8 developer conference this year, Jerome Pesenti, Vice President of AI at Facebook, introduced representatives from Microsoft, Airbnb, Genentech, and Toyota Research Institute, who talked about how the framework is helping them build, train, and deploy production-ready AI. Below are some excerpts from their talks.

Read also: How PyTorch is bridging the gap between research and production at Facebook: PyTorch team at F8 conference

How Microsoft uses PyTorch for its language modeling service

David Aronchick, Head of Open Source Machine Learning Strategy at Microsoft Azure

At Microsoft, PyTorch is being used in their language modeling service. The language modeling service uses state-of-the-art language models for both 1P (first-party) and 3P (third-party). Microsoft explored a number of deep learning frameworks but ran into several issues: a slow transition from research to production, inconsistent and frequently changing APIs, and a trade-off between high-level ease of use and low-level flexibility. To overcome these issues, Microsoft, in partnership with Facebook, built an internal language modeling toolkit on top of PyTorch. Using the native extensibility that PyTorch provided, Microsoft was able to build advanced, custom tasks and architectures. Onboarding of new users also improved, helped by an active and inviting community. As a result of this work, Microsoft was able to scale the language modeling features to billions of words. It also led to intuitive, static, and consistent APIs, which resulted in a seamless migration from language modeling toolkit v0.4 to 1.0. They also saw improvements in model sizes.

Microsoft has partnered with ICS.ai to deliver conversational AI bots across the public sector in the UK. ICS.ai, based in Basingstoke, has trained its Microsoft AI-driven chatbots to scale to the demands of large county councils, healthcare trusts and universities.

How Airbnb is using conversational AI tools in PyTorch to enhance customer experience

Cindy Chen, Senior Machine Learning Data Scientist at Airbnb

Airbnb has built a dialog assistant to integrate smart replies and enhance its customer experience. The core of Airbnb's dialog assistant for customer service is powered by PyTorch.
They have built the smart replies recommendation model by treating it as a machine translation problem.  Airbnb is translating the customer's input message into agent responses by building a sequence to sequence model. They leverage PyTorch’s Open neural machine translation library to build the sequence to sequence model.  Using Pytorch has significantly sped up the Airbnb’s model development cycle as PyTorch provides them with state-of-the-art technologies such as various attention mechanisms and beam search.  How Genentech uses Pytorch in drug discovery and cancer therapy Daniel Bozinov, Head of AI - Early clinical development informatics, Genentech At Genentech, PyTorch is being used to develop personalized cancer medicine as well as for drug discovery and in cancer therapy.  For drug development, Genentech has built deep learning models for specific domains to make some predictions about the properties of molecules such as toxicity. They're also applying AI to come up with new cancer therapies. They identify unique molecules specific to cancer cells that are only produced by those cancer cells, potentially sensitizing the immune system to attack those cancer cells and basically treat them like an infection.   PyTorch has been their deep learning framework of choice because of features such as easier debugging, more flexible control structures, being natively pythonic, and it’s Dynamic graphs which yield in faster execution. Their model architecture is inspired by textual entailment in natural language processing. They use a partially recurrent neural network as well as a straightforward feed-forward network, combine the outputs of these two networks and predict the peptide binding. Toyota Research Institute adds new driver support features in cars Adrien Gaidon, Machine Learning Lead, Toyota Research Institute Toyota developed a cutting-edge cloud platform for distributed deep learning on high-resolution sensory inputs, especially video. This was designed to add new driver support features to the cars. PyTorch was instrumental in scaling up Toyota’s deep learning system because of features like simple API, integration with the global Python ecosystem, and overall a great user experience for fast exploration. It’s also fast for training on a very large scale. In addition to amping up TRI’s creativity and expertise, Pytorch has also amplified Toyota’s capabilities to iterate quickly from idea to real-world use cases. The team at TRI is excited for new Pytorch production features that will help them accelerate Toyota even further.  In this post, we have only summarized the talks. At F8, these researchers spoke in length about each of their company’s projects and how PyTorch has been instrumental in their growth. You can watch the full video on YouTube.  If you are inspired to build your PyTorch-based deep learning and machine learning models, we recommend you to go through our book PyTorch Deep Learning Hands-On. Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in Deep Learning, and more PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility
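As an aside for readers who want to see what the sequence-to-sequence idea behind Airbnb's smart replies looks like in PyTorch, here is a minimal, illustrative sketch. The toy vocabulary, layer sizes, and GRU-based encoder/decoder with dot-product attention are our own assumptions for the example, not Airbnb's production code.

```python
import torch
import torch.nn as nn

# Toy sizes -- purely illustrative assumptions
VOCAB_SIZE, EMB_DIM, HID_DIM = 1000, 64, 128

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.GRU(EMB_DIM, HID_DIM, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        outputs, hidden = self.rnn(self.embed(src))
        return outputs, hidden                    # outputs: (batch, src_len, HID_DIM)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.GRU(EMB_DIM + HID_DIM, HID_DIM, batch_first=True)
        self.out = nn.Linear(HID_DIM, VOCAB_SIZE)

    def forward(self, tgt_token, hidden, enc_outputs):
        # tgt_token: (batch, 1) -- one decoding step at a time
        emb = self.embed(tgt_token)                                   # (batch, 1, EMB_DIM)
        # Dot-product attention over the encoder outputs
        scores = torch.bmm(enc_outputs, hidden[-1].unsqueeze(2))      # (batch, src_len, 1)
        weights = torch.softmax(scores, dim=1)
        context = torch.bmm(weights.transpose(1, 2), enc_outputs)     # (batch, 1, HID_DIM)
        output, hidden = self.rnn(torch.cat([emb, context], dim=2), hidden)
        return self.out(output.squeeze(1)), hidden                    # logits over the vocabulary

# One forward pass on dummy data
encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, VOCAB_SIZE, (4, 12))       # a batch of 4 "customer messages"
enc_out, hidden = encoder(src)
logits, hidden = decoder(torch.zeros(4, 1, dtype=torch.long), hidden, enc_out)
print(logits.shape)                               # torch.Size([4, 1000])
```

In practice, a beam search over the decoder's step-by-step predictions (as mentioned in the Airbnb talk) would replace the single greedy step shown here.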

article-image-data-professionals-planning-to-learn-this-year-python-deep-learning

What are data professionals planning to learn this year? Python, deep learning, yes. But also...

Amey Varangaonkar
14 Jun 2018
4 min read
One thing that every data professional absolutely dreads is the day their skills are no longer relevant in the market. In an ever-changing tech landscape, one must be constantly on the lookout for the most relevant, industrially accepted tools and frameworks. This is applicable everywhere - from application and web developers to cybersecurity professionals. Data professionals are no exception, as new ways to extract actionable insights from raw data are being discovered almost every day.

Gone are the days when data pros stuck to a single language and framework to work with their data. Frameworks are more flexible now, with multiple dependencies across various tools and languages. Not just that, new domains are being identified where these frameworks can be applied, and how they are applied varies massively as well. A whole new arena of possibilities has opened up, and with it, new sets of skills and toolkits for these domains have been unlocked.

What's the next big thing for data professionals?

We recently polled thousands of data professionals as part of our Skill Up program and got some very interesting insights into what they think the future of data science looks like. We asked them what they were planning to learn in the next 12 months. The following word cloud is the result of their responses, weighted by the frequency of the tools they chose:

What data professionals are planning on learning in the next 12 months

Unsurprisingly, Python comes out on top as the language many data pros want to learn in the coming months. With its general-purpose nature and innumerable applications across various use cases, Python's sky-rocketing popularity is the reason everybody wants to learn it.

Machine learning and AI are finding significant applications in the web development domain today. They are revolutionizing customers' digital experience through conversational UIs, or chatbots. Not just that, smart machine learning algorithms are being used to personalize websites and their UX. With all these reasons, who wouldn't want to learn JavaScript as an important tool to have in their data science toolkit? Add to that the trending web dev framework Angular, and you have all the tools to build smart, responsive front-end web applications.

We also saw data professionals taking an active interest in the mobile and cloud domains. They aim to learn Kotlin and combine its power with data science tools for developing smarter and more intelligent Android apps. When it comes to the cloud, Microsoft's Azure platform has introduced many built-in machine learning capabilities, as well as a workbench for data scientists to develop effective, enterprise-grade models. Data professionals also prefer Docker containers to run their applications seamlessly, and hence demand for learning Docker is quite high.

Has machine learning with JavaScript caught your interest? Don't worry, we've got you covered - check out Hands-on Machine Learning with JavaScript for a practical, hands-on coverage of the essential machine learning concepts using the leading web development language.

With crypto's popularity through the roof (sadly, we can't say the same about Bitcoin's price), data pros see blockchain as a valuable skill. Building secure, decentralized apps is on the agenda for many, perhaps. Cloud, Big Data, and Artificial Intelligence are some of the other domains that data pros find interesting and feel worth skilling up in.

Work-related skills that data pros want to learn

We also asked the data professionals what skills they wanted to learn in the near future that could help them do their daily jobs more effectively. The following word cloud of their responses paints a pretty clear picture:

Valuable skills data professionals want to learn for their everyday work

As machine learning and AI go mainstream, so do their applications in mainstream domains - often resulting in complex problems. Well, there's deep learning, and specifically neural networks, to tackle these problems, and these are exactly the skills data pros want to master in order to excel at their work.

Data pros want to learn machine learning in Python. Do you? Here's a useful resource for you to get started - check out Python Machine Learning, Second Edition today!

So, there it is! What are the tools, languages or frameworks that you are planning to learn in the coming months? Do you agree with the results of the poll? Do let us know.

What are web developers' favorite front-end tools? Packt's Skill Up report reveals all
Data cleaning is the worst part of data analysis, say data scientists
15 Useful Python Libraries to make your Data Science tasks Easier

article-image-neurips-2018-rethinking-transparency-and-accountability-in-machine-learning

NeurIPS 2018: Rethinking transparency and accountability in machine learning

Bhagyashree R
16 Dec 2018
8 min read
Key takeaways from the discussion

To solve problems with machine learning, you must first understand them.
Different people or groups of people are going to define a problem in different ways, so we shouldn't believe that the way we want to frame the problem computationally is the right way.
If we allow that our systems include people and society, it is clear that we have to help negotiate values, not simply define them.

Last week, at the 32nd annual NeurIPS conference, Nitin Kohli, Joshua Kroll, and Deirdre Mulligan presented the common pitfalls we see when studying the human side of machine learning. Machine learning is being used in high-impact areas like medicine, criminal justice, employment, and education for making decisions. In recent years, we have seen that this use of machine learning and algorithmic decision making has resulted in unintended discrimination. It's becoming clear that even models developed with the best of intentions may exhibit discriminatory biases and perpetuate inequality.

Although researchers have been analyzing how to put concepts like fairness, accountability, transparency, explanation, and interpretability into practice in machine learning, properly defining these things can prove a challenge. Attempts have been made to define them mathematically, but this can bring new problems. This is because applying mathematical logic to human concepts that have unique and contested political and social dimensions necessarily has blind spots - every point of contestation can't be integrated into a single formula. In turn, this can cause friction with other disciplines as well as the public. Based on their research on what various terms mean in different contexts, Kohli, Kroll, and Mulligan drew out some of the most common misconceptions machine learning researchers and practitioners hold.

Sociotechnical problems

To find a solution to a particular problem, data scientists need precise definitions. But how can we verify that these definitions are correct? Indeed, many definitions will be contested, depending on who you are and what you want them to mean. "A definition that is fair to you will not necessarily be fair to me," remarks Kroll. Kroll explained that while definitions can be unhelpful, they are nevertheless essential from a mathematical perspective. This means there appears to be an unresolved conflict between concepts and mathematical rigor. But there might be a way forward.

Perhaps it's wrong to think in this dichotomy of logical rigor versus the messy reality of human concepts at all. One of the ways out of this impasse is to get beyond the dichotomy: although it's tempting to place the technical and mathematical dimension on one side and the social and political aspect on the other, we should instead see them as intricately related. They are, Kroll suggests, sociotechnical problems. Kroll goes on to say that we cannot ignore the social consequences of machine learning: "Technologies don't live in a vacuum and if we pretend that they do we kind of have put our blinders on and decided to ignore any human problems."

Fairness in machine learning

In the real world, fairness is a concept directly linked to processes. Think, for example, of the voting system. Citizens cast votes for their preferred candidates, and the candidate who receives the most support is elected. Here, even though the winning candidate may not be the one a given citizen voted for, that citizen at least got the chance to participate in the process. This type of fairness is called procedural fairness. However, in the technical world, fairness is often viewed in a subtly different way: when you place it in a mathematical context, fairness centers on outcome rather than process.

Kohli highlighted that trade-offs between these different concepts can't be avoided; they're inevitable. A mathematical definition of fairness places a constraint on the behavior of a system, and this constraint narrows down the class of models that can satisfy the conditions. So, if we decide to add too many fairness constraints to the system, some of them will be self-contradictory.

One more important point machine learning practitioners should keep in mind is that when we talk about the fairness of a system, that system isn't a self-contained and coherent thing. It is not a logical construct - it's a social one. This means there is a whole host of values, ideas, and histories that have an impact on its reality. In practice, this ultimately means that the complexity of the real world from which we draw and analyze data can have an impact on how a model works. Kohli explained this by saying, "it doesn't really matter... whether you are building a fair system if the context in which it is developed and deployed is fundamentally unfair."

Accountability in machine learning

Accountability is ultimately about trust. It's about the extent to which you can be sure you know what is 'true' about a system - that you know how it works and why it does things in certain ways. In more practical terms, it's all about invariance and reliability. To ensure accountability inside machine learning models, we need to follow a layered model. The bottom layer is an accounting or recording layer that keeps track of what a given system is doing and the ways in which it might have been changed. The next layer is a more analytical one: this is where the records from the bottom layer are analyzed and decisions are made about performance - whether anything needs to be changed and how. The final, top-most layer is about responsibility. It's where the proverbial buck stops - with those outside of the algorithm, those involved in its construction. "Algorithms are not responsible, somebody is responsible for the algorithm," explains Kroll.

Transparency

Transparency is a concept heavily tied up with accountability; arguably, you have no accountability without transparency. The layered approach discussed above should help with transparency, but it's also important to remember that transparency is about much more than simply making data and code available. It demands that the decisions made in the development of the system are made available and clear too. Kroll emphasizes, "to the person at the ground level for whom the decisions are being taken by some sort of model, these technical disclosures aren't really useful or understandable."

Explainability

In his paper Explanation in Artificial Intelligence: Insights from the Social Sciences, Tim Miller describes what explainable artificial intelligence is. According to Miller, explanation takes many forms, such as causal, contrastive, selective, and social. A causal explanation gives reasons why something happened, for example, while contrastive explanations can provide answers to questions like "Why P rather than not-P?". But the most important point here is that explanations are selective. An explanation cannot include all the reasons why something happened; explanations are always context-specific, a response to a particular need or situation. Think of it this way: if someone asks you why the toaster isn't working, you could just say that it's broken. That might be satisfactory in some situations, but you could, of course, offer a more substantial explanation, outlining what was technically wrong with the toaster, how that technical fault came to be there, how the manufacturing process allowed that to happen, how the business allowed that manufacturing process to make that mistake... you could go on and on.

Data is not the truth

Today, there is a huge range of datasets available to help you develop different machine learning models. These models can be useful, but it's essential to remember that they are models. A model isn't the truth - it's an abstraction, a representation of the world in a very specific way. One way of taking this fact into account is the concept of 'construct validity'. This sounds complicated, but all it really refers to is the extent to which a test - say, a machine learning algorithm - actually measures what it claims to measure. The concept is widely used in disciplines like psychology, but in machine learning it often simply refers to the way we validate a model based on its historical predictive accuracy. In a nutshell, it's important to remember that just as data is an abstraction of the world, models are an abstraction of the data. There's no way of changing this, but having an awareness that we're dealing in abstractions ensures we don't lapse into the mistake of thinking we are in the realm of 'truth'.

Building fair(er) systems will ultimately require an interdisciplinary approach, involving domain experts working in a variety of fields. If machine learning and artificial intelligence are to make a valuable and positive impact in fields such as justice, education, and medicine, it's vital that those working in those fields work closely with those with expertise in algorithms. This won't fix everything, but it will be a more robust foundation from which we can begin to move forward. You can watch the full talk on the Facebook page of NeurIPS.

Researchers unveil a new algorithm that allows analyzing high-dimensional data sets more effectively, at NeurIPS conference
Accountability and algorithmic bias: Why diversity and inclusion matters [NeurIPS Invited Talk]
NeurIPS 2018: A quick look at data visualization for Machine learning by Google PAIR researchers [Tutorial]
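To make the point about mathematical fairness definitions acting as constraints on outcomes a little more concrete, the short sketch below computes two commonly used gaps - demographic parity and equal opportunity - over a set of model decisions. The decisions, outcomes, group labels, and thresholds are invented for the example; they are not from the talk.

```python
import numpy as np

# Invented example data: model decisions (1 = approve), true outcomes, and a group label
decisions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
outcomes  = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
group     = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

def demographic_parity_gap(decisions, group):
    """Difference in approval rates between the two groups."""
    rate_a = decisions[group == "a"].mean()
    rate_b = decisions[group == "b"].mean()
    return abs(rate_a - rate_b)

def equal_opportunity_gap(decisions, outcomes, group):
    """Difference in true positive rates (approval rate among truly positive cases)."""
    tpr = {}
    for g in ("a", "b"):
        mask = (group == g) & (outcomes == 1)
        tpr[g] = decisions[mask].mean()
    return abs(tpr["a"] - tpr["b"])

print("Demographic parity gap:", demographic_parity_gap(decisions, group))
print("Equal opportunity gap:", equal_opportunity_gap(decisions, outcomes, group))
# Satisfying one criterion exactly does not, in general, satisfy the other --
# which is the sense in which adding many fairness constraints can become self-contradictory.
```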

article-image-highest-paying-data-science-jobs-2017

Highest Paying Data Science Jobs in 2017

Amey Varangaonkar
27 Nov 2017
10 min read
It is no secret that this is the age of data. More data has been created in the last two years than ever before. Within the dumps of data created every second, businesses are looking for useful, actionable insights which they can use to enhance their processes and thereby increase their revenue and profitability. As a result, the demand for data professionals - who sift through terabytes of data for accurate analysis and extract valuable business insights from it - is now higher than ever before. Think of data science as a large tree from which all things related to data branch out, from plain old data management and analysis to Big Data and more. Even the recently booming trends in artificial intelligence, such as machine learning and deep learning, are applied in many ways within data science. Data science has continued to be a lucrative and growing job market in recent years, as evidenced by the graph below:

Source: Indeed.com

In this article, we look at some of the high-paying, trending job roles in the data science domain that you should definitely look out for if you're considering data science as a serious career opportunity. Let's get started with the obvious and most popular role.

Data Scientist

Dubbed the sexiest job of the 21st century, data scientists utilize their knowledge of statistics and programming to turn raw data into actionable insights. From identifying the right dataset, to cleaning and readying the data for analysis, to gleaning insights from that analysis, data scientists communicate the results of their findings to the decision makers. They also act as advisors to executives and managers by explaining how the data affects a particular product or process within the business so that appropriate actions can be taken.

Per Salary.com, the median annual salary for the role of a data scientist today is $122,258, with a range between $106,529 and $137,037. The salary is also accompanied by a whole host of benefits and perks, which vary from one organization to another, making this one of the best and most in-demand jobs in the market today. This is a clear testament to the fact that an increasing number of businesses now take the value of data seriously and want the best talent to help them extract that value. There are over 20,000 jobs listed for the role of data scientist, and the demand is only growing.

Source: Indeed.com

To become a data scientist, you need a bachelor's or a master's degree in mathematics or statistics and more than 5 years of work experience in a related field. You will need to possess a unique combination of technical and analytical skills to understand the problem statement and propose the best solution, good programming skills to develop effective data models, and visualization skills to communicate your findings to the decision makers. Interested in becoming a data scientist? Here are some resources to help you get started:

Principles of Data Science
Data Science Algorithms in a Week
Getting Started with R for Data Science [Video]

For a more comprehensive learning experience, check out our skill plan for becoming a data scientist on Mapt, our premium skills development platform.

Data Analyst

Probably a term you are quite familiar with, data analysts are responsible for crunching large amounts of data and analyzing it to come to appropriate logical conclusions. Whether it's related to pure research or working with domain-specific data, a data analyst's job is to make decision makers' jobs easier by giving them useful insights. Effective data management, analyzing data, and reporting results are some of the common tasks associated with this role. How is this role different from that of a data scientist, you might ask. While data scientists specialize in maths, statistics, and predictive analytics for better decision making, data analysts specialize in the tools and components of data architecture for better analysis.

Per Salary.com, the median annual salary for an entry-level data analyst is $55,804, and the range usually falls between $50,063 and $63,364, excluding bonuses and benefits. For more experienced data analysts, this figure rises to a mean annual salary of around $88,532. With over 83,000 jobs listed on Indeed.com, this is one of the most popular job roles in the data science community today. The role has a relatively low barrier to entry, which is reflected in the lower starting salary packages. As you gain more experience, you can move up the ladder and look at becoming a data scientist or a data engineer.

Source: Indeed.com

You may also come across terms such as business data analyst, or simply business analyst, which are sometimes used interchangeably with the role of a data analyst. While their primary responsibilities are centered around data crunching, business analysts model company infrastructure, while data analysts model business data structures. You can find more information related to the differences in this interesting article. If becoming a data analyst is something that interests you, here are some very good starting points:

Data Analysis with R
Python Data Analysis, Second Edition
Learning Python Data Analysis [Video]

Data Architect

Data architects are responsible for creating a solid data management blueprint for an organization. They are primarily responsible for designing the data architecture and defining how data is stored, consumed, and managed by different applications and departments within the organization. Because of these critical responsibilities, a data architect's job is a very well-paid one. Per Salary.com, the median annual salary for an entry-level data architect is $74,809, with a range between $57,964 and $91,685. For senior-level data architects, the median annual salary rises to $136,856, with a range usually between $121,969 and $159,212. These high figures are justified by the critical nature of the role - planning and designing the right data infrastructure after understanding the business considerations, in order to get the most value out of the data. At present, there are over 23,000 jobs for the role listed on Indeed.com, with a stable trend in job seeker interest, as shown:

Source: Indeed.com

To become a data architect, you need a bachelor's degree in computer science, mathematics, statistics, or a related field, and loads of real-world skills to qualify for even the entry-level positions. Technical skills such as statistical modeling, knowledge of languages such as Python and R, database architectures, Hadoop-based skills, knowledge of NoSQL databases, and some machine learning and data mining are required to become a data architect. You also need strong collaborative skills, problem-solving, creativity, and the ability to think on your feet to solve the trickiest of problems on the go. Suffice to say, it's not an easy job, but it is definitely a lucrative one! Get ahead of the curve, and start your journey to becoming a data architect now:

Big Data Analytics
Hadoop Blueprints
PostgreSQL for Data Architects

Data Engineer

Data engineers, or Big Data engineers, are a crucial part of the organizational workforce and work in tandem with data architects and data scientists to ensure appropriate data management systems are deployed and the right kind of data is being used for analysis. They deal with messy, unstructured Big Data and strive to provide clean, usable data to the other teams within the organization. They build high-performance analytics pipelines and develop sets of processes for efficient data mining. In many companies, the role of a data engineer is closely associated with that of a data architect: while an architect is responsible for the planning and designing stages of the data infrastructure project, a data engineer looks after its construction, testing, and maintenance. As such, data engineers tend to have a more in-depth understanding of different data tools and languages than data architects.

There are over 90,000 jobs listed on Indeed.com, suggesting very high demand within organizations for this kind of role. An entry-level data engineer has a median annual salary of $90,083 per Payscale.com, with a range of $60,857 to $131,851. For senior data engineers, the average salary shoots up to $123,749, per Glassdoor estimates.

Source: Indeed.com

With the unimaginable rise in the sheer volume of data, the onus is on data engineers to build the right systems that empower data analysts and data scientists to sift through messy data and derive actionable insights from it. If becoming a data engineer is something that interests you, here are some of our products you might want to look at:

Real-Time Big Data Analytics
Apache Spark 2.x Cookbook
Big Data Visualization

You can also check out our detailed skill plan on becoming a Big Data Engineer on Mapt.

Chief Data Officer

There are countless organizations that build their businesses on data but don't manage it all that well. This is where a senior executive popularly known as the Chief Data Officer (CDO) comes into play - bearing the responsibility for implementing the organization's data and information governance and assisting with data-driven business strategies. They are primarily responsible for ensuring that their organization gets the most value out of its data, and for putting appropriate plans in place for effective data quality and life-cycle management.

The role of a CDO is one of the most lucrative and highest-paying jobs on the data science frontier. The median annual pay for a CDO, per Payscale.com, is around $192,812. Indeed.com lists just over 8,000 job postings too - not a very large number, but understandable considering the recent emergence of the role and the fact that it is a high-profile, C-suite job.

Source: Indeed.com

According to Gartner research, almost 50% of companies in a variety of regulated industries will have a CDO in place by 2017. Considering the demand for the role, and the fact that it is only going to rise in the future, the role of a CDO is one worth vying for. To become a CDO, you will obviously need a solid understanding of statistical, mathematical, and analytical concepts. Not just that - extensive, high-value experience in managing technical teams and information management solutions is also a prerequisite. Along with a thorough understanding of the various Big Data tools and technologies, you will need strong communication skills and a deep understanding of the business. If you want to know more about how you can become a Chief Data Officer, you can browse through our piece on the role of the CDO.

Why demand for data science professionals will rise

It's hard to imagine an organization that doesn't have to deal with data, but it's harder to imagine the state of an organization with petabytes of data and no idea what to do with it. With the vast amounts of data organizations deal with these days, the need for experts who know how to handle that data and derive relevant, timely insights from it is higher than ever. In fact, IBM predicts a severe shortage of data science professionals - and thereby tremendous growth in job offers and advertised openings - by 2020. Not everyone is equipped with the technical skills and know-how associated with tasks such as data mining, machine learning, and more. This is creating a massive talent void that organizations are looking to fill quickly by offering lucrative salaries and added benefits. Without the professional expertise to turn data into actionable insights, Big Data becomes all but useless.
article-image-deep-learning-indaba-presents-the-state-of-natural-language-processing-in-2018

Deep Learning Indaba presents the state of Natural Language Processing in 2018

Sugandha Lahoti
12 Dec 2018
5 min read
The 'Strengthening African Machine Learning' conference, organized by Deep Learning Indaba at Stellenbosch, South Africa, is ongoing right now. This six-day conference celebrates and strengthens machine learning in Africa through state-of-the-art teaching, networking, policy debate, and support programmes. Yesterday, three conference organizers - Sebastian Ruder, Herman Kamper, and Stephan Gouws - asked tech experts their views on the state of Natural Language Processing, specifically these four questions:

What do you think are the three biggest open problems in Natural Language Processing at the moment?
What would you say is the most influential work in Natural Language Processing in the last decade, if you had to pick just one?
What, if anything, has led the field in the wrong direction?
What advice would you give a postgraduate student in Natural Language Processing starting their project now?

The tech experts interviewed included the likes of Yoshua Bengio, Hal Daumé III, Barbara Plank, Miguel Ballesteros, Anders Søgaard, Lea Frermann, Michael Roth, Annie Louise, Chris Dyer, Felix Hill, Kevin Knight, and more.

https://twitter.com/seb_ruder/status/1072431709243744256

Biggest open problems in Natural Language Processing at the moment

Although each expert talked about a variety of open issues in Natural Language Processing, the following common themes recurred.

No 'real' natural language understanding

Many experts argued that natural language understanding is central, and also important for natural language generation. They agreed that most of our current Natural Language Processing models do not have a "real" understanding. What is needed is to build models that incorporate common sense, and to work out what (biases, structure) should be built explicitly into these models. Dialogue systems and chatbots were mentioned in several responses. Maletšabisa Molapo, a Research Scientist at IBM Research and one of the experts, answered, "Perhaps this may be achieved by general NLP Models, as per the recent announcement from Salesforce Research, that there is a need for NLP architectures that can perform well across different NLP tasks (machine translation, summarization, question answering, text classification, etc.)"

NLP for low-resource scenarios

Another open problem is using NLP in low-resource scenarios. This covers generalization beyond the training data, learning from small amounts of data, and techniques such as domain transfer, transfer learning, and multi-task learning. It also covers the many flavours of supervision: semi-supervised, weakly-supervised, "Wiki-ly" supervised, distantly-supervised, lightly-supervised, minimally-supervised, and unsupervised learning. Per Karen Livescu, Associate Professor at the Toyota Technological Institute at Chicago, "Dealing with low-data settings (low-resource languages, dialects (including social media text "dialects"), domains, etc.). This is not a completely "open" problem in that there are already a lot of promising ideas out there; but we still don't have a universal solution to this universal problem."

Reasoning about large or multiple contexts

Experts believe that NLP struggles to deal with large contexts, whether in long text documents or spoken documents, and that current models still lack the incorporation of common sense. According to Isabelle Augenstein, tenure-track assistant professor at the University of Copenhagen, "Our current models are mostly based on recurrent neural networks, which cannot represent longer contexts well. One recent encouraging work in this direction I like is the NarrativeQA dataset for answering questions about books. The stream of work on graph-inspired RNNs is potentially promising, though has only seen modest improvements and has not been widely adopted due to them being much less straight-forward to train than a vanilla RNN."

Defining problems, building diverse datasets, and evaluation procedures

"Perhaps the biggest problem is to properly define the problems themselves. And by properly defining a problem, I mean building datasets and evaluation procedures that are appropriate to measure our progress towards concrete goals. Things would be easier if we could reduce everything to Kaggle style competitions!" - Mikel Artetxe. Experts believe that current NLP evaluation datasets and procedures need rethinking: a new generation of evaluation datasets and tasks is required to show whether NLP techniques generalize across the true variability of human language. More diverse datasets are also needed. "Datasets and models for deep learning innovation for African Languages are needed for many NLP tasks beyond just translation to and from English," said Molapo.

Advice to a postgraduate student in NLP starting their project

Do not limit yourself to reading NLP papers. Read a lot of machine learning, deep learning, reinforcement learning papers. A PhD is a great time in one's life to go for a big goal, and even small steps towards that will be valued. - Yoshua Bengio

Learn how to tune your models, learn how to make strong baselines, and learn how to build baselines that test particular hypotheses. Don't take any single paper too seriously, wait for its conclusions to show up more than once. - George Dahl

I believe scientific pursuit is meant to be full of failures. If every idea works out, it's either because you're not ambitious enough, you're subconsciously cheating yourself, or you're a genius, the last of which I heard happens only once every century or so. So, don't despair! - Kyunghyun Cho

Understand psychology and the core problems of semantic cognition. Understand machine learning. Go to NeurIPS. Don't worry about ACL. Submit something terrible (or even good, if possible) to a workshop as soon as you can. You can't learn how to do these things without going through the process. - Felix Hill

Make sure to go through the complete list of all expert responses for better insights.

Google open sources BERT, an NLP pre-training technique
Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial]
Intel AI Lab introduces NLP Architect Library
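George Dahl's advice above about building strong baselines is easy to act on, even in a low-resource setting. The sketch below sets up a simple TF-IDF plus logistic regression baseline for a tiny text classification task with scikit-learn; the toy sentences, labels, and hyperparameters are invented for illustration and are not from the article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Tiny invented dataset standing in for a low-resource labelled corpus
texts = [
    "the booking was cancelled without notice",
    "great host, very clean apartment",
    "refund never arrived, terrible support",
    "lovely stay, would book again",
    "the room was dirty and noisy",
    "fantastic location and friendly host",
]
labels = [0, 1, 0, 1, 0, 1]  # 0 = negative, 1 = positive

# A strong, cheap baseline: word n-gram TF-IDF features + logistic regression
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)

# Cross-validation keeps the evaluation honest even with very little data
scores = cross_val_score(baseline, texts, labels, cv=3)
print("Accuracy per fold:", scores)
```

A baseline like this gives any fancier neural model a concrete number to beat before you invest in it.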

article-image-all-of-my-engineering-teams-have-a-machine-learning-feature-on-their-roadmap-will-ballard-talks-artificial-intelligence-in-2019-interview

“All of my engineering teams have a machine learning feature on their roadmap” - Will Ballard talks artificial intelligence in 2019 [Interview]

Packt Editorial Staff
02 Jan 2019
3 min read
The huge advancements in deep learning and artificial intelligence were perhaps the biggest story in tech in 2018. But we wanted to know what the future might hold - luckily, we were able to speak to Packt author Will Ballard about what he sees in store for artificial intelligence in 2019 and beyond.

Will Ballard is the chief technology officer at GLG, responsible for engineering and IT. He was also responsible for the design and operation of large data centers that helped run site services for customers including Gannett, Hearst Magazines, NFL, NPR, The Washington Post, and Whole Foods. He has held leadership roles in software development at NetSolve (now Cisco), NetSpend, and Works (now Bank of America). Explore Will Ballard's Packt titles here.

Packt: What do you think the biggest development in deep learning / AI was in 2018?

Will Ballard: I think attention models beginning to take the place of recurrent networks is a pretty impressive breakout on the algorithm side.

In Packt's 2018 Skill Up survey, developers across disciplines and job roles identified machine learning as the thing they were most likely to be learning in the coming year. What do you think of that result? Do you think machine learning is becoming a mandatory multidiscipline skill, and why?

Almost all of my engineering teams have an active or planned machine learning feature on their roadmap. We've been able to get all kinds of engineers with different backgrounds to use machine learning - it really is just another way to make functions - probabilistic functions - but functions.

What do you think the most important new deep learning/AI technique to learn in 2019 will be, and why?

In 2019, I think it is going to be all about PyTorch and TensorFlow 2.0, and learning how to host these on cloud PaaS.

The benefits of automated machine learning and metalearning

How important do you think automated machine learning and metalearning will be to the practice of developing AI/machine learning in 2019? What benefits do you think they will bring?

Even 'simple' automation techniques like grid search and running multiple different algorithms on the same data are big wins when mastered. There is almost no telling which model is 'right' till you try it, so why not let a cloud of computers iterate through scores of algorithms and models to give you the best available answer?

Artificial intelligence and ethics

Do you think ethical considerations will become more relevant to developing AI/machine learning algorithms going forwards? If yes, how do you think this will be implemented?

I think the ethical issues are important for outcomes, and for how models are used, but aren't the place of the algorithms themselves.

If a developer was looking to start working with machine learning/AI, what tools and software would you suggest they learn in 2019?

Python and PyTorch.
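Ballard's point about "simple" automation such as grid search can be made concrete in a few lines of scikit-learn. The sketch below - with a synthetic toy dataset and arbitrary parameter grids chosen purely for illustration - searches over two different model families and reports the best configuration of each.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; in practice this would be your own features and labels
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Try several algorithms, each with its own small hyperparameter grid
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=2000),
                            {"C": [0.01, 0.1, 1.0, 10.0]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 200], "max_depth": [None, 5, 10]}),
}

for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, n_jobs=-1)  # exhaustive search with 5-fold CV
    search.fit(X, y)
    print(f"{name}: best score={search.best_score_:.3f}, params={search.best_params_}")
```

Running several algorithms side by side like this is exactly the "let the computers iterate" win he describes, and it scales naturally to more models and larger grids on cloud infrastructure.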

article-image-data-science-saved-christmas

How Data Science saved Christmas

Aaron Lazar
22 Dec 2017
9 min read
It's the middle of December and it's shivery cold in the North Pole at -20°C. A fat old man sits on a big brown chair beside the fireplace, stroking his long white beard. His face has a frown on it, quite unlike his usual self. Mr. Claus quips, "Ruddy mailman should have been here by now! He's never this late to bring in the li'l ones' letters."

[Image: Nervous Santa Claus on Christmas Eve, sitting in an armchair and resting his head on his hands]

Santa gets up from his chair, his trouser buttons crying for help, thanks to his massive belly. He waddles over to the window and looks out. He's sad that he might not be able to get the children their gifts in time this year. Amidst the snow, he can see a glowing red light. "Oh Rudolph!" he chuckles. All across the living room are pictures of little children beaming with joy, holding their presents in their hands. A small smile starts building, and then suddenly Santa gets a new-found determination to get the presents over to the children, come what may! An idea strikes him as he waddles over to his computer room.

Now Mr. Claus may be old on the outside, but on the inside he's nowhere close! He recently set up a new rig, all by himself: six Nvidia GTX Titans coupled with sixteen gigs of RAM, a 40-inch curved monitor that he uses to keep an eye on who's being naughty or nice, and a 1000-watt home theater system with surround sound, heavy on the bass. On the software side, he's got a whole load of tools: the Python language (not the Garden of Eden variety), OpenCV - his all-seeing eye on the kids - and, well, TensorFlow et al.

Now, you might wonder what an old man is doing with such heavy software and hardware. A few months ago, Santa caught wind of an up-and-coming trend that involves working with tonnes of data: cleaning, processing and making sense of it. The idea of crunching data somehow tickled the old man, and since then the jolly good master tinkerer and his army of merry elves have been experimenting away with data. Santa's pretty much self-taught at whatever he does, be it driving a sleigh or learning something new. A couple of interesting books he picked up from Packt were Python Data Science Essentials - Second Edition, Hands-On Data Science and Python Machine Learning, and Python Machine Learning - Second Edition. After spending some time on the internet, he put together a list of things he needed to set up his rig and got them from Amazon.

[Image: Santa Claus using a laptop on top of a house]

He quickly boots up the computer and starts up TensorFlow. He needs to come up with a list of probable things that each child would have wanted for Christmas this year. Now, there are over 2 billion children in the world, and finding each one's wish is going to be quite a task! But nothing is too difficult for Santa! He gets to work, his big head buried in his keyboard, his long locks falling over his shoulder.

So, this was his plan. Considering that the kids might have shared their secret wish with someone, Santa plans to tackle the problem from different angles, to reach a higher probability of getting the right gifts:

He plans to gather email and social media data from all the kids' computers - all from the past month
It's a good thing kids have started owning phones at such an early age now - he plans to analyze all incoming and outgoing phone calls from the past month
He taps into every country's local police department records to stream security footage from all over the world

[Image: A young boy in a red Christmas hat and sweater writing a letter to Santa Claus at a wooden table in front of a Christmas tree]

If you've read this far, you're probably wondering whether this article is about Mr. Claus or Mr. Bond. Yes, the equipment and strategy would have fit an MI6 or a CIA agent's role. You never know, Santa might just be a retired agent. Do they ever retire? Hmm! Anyway, it takes a while before he can get all the data he needs. He trusts Spark to put this data in order; it is stored in a massive data center in his basement (he's a bit cautious after all the news about data breaches). And he's off to work!

He sifts through the emails and messages, snorting from time to time at some of the hilarious ones. TensorFlow rips through the data, picking out keywords for Santa. It takes him a few hours to get done with the emails and social media data alone! By the time he has a list, it's evening and time for supper. Santa calls it a day and prepares to continue the next day.

The next day, Santa gets up early and boots up his equipment as he brushes and flosses. He plonks himself in the huge swivel chair in front of the monitor, munching on freshly baked gingerbread. He starts tapping into all the phone company databases across the world, fetching all the data into his data center. Now, Santa can't afford to spend the whole time analyzing voices himself, so he lets TensorFlow analyze the voices and segregate the keywords it picks up from the voice signals - every kid's name mapped to a possible gift. Now, there were a lot of unmentionable things that got linked to several kids' names. Santa almost fell off his chair when he saw the list. "These kids grow up way too fast these days!"

It's almost 7 PM when Santa realizes that there's way too much data to process in a day. A few days later, Santa returns to his tech abode to check up on the progress of the call data processing. There's a huge list waiting in front of him. He thinks to himself, "This will need a lot of cleaning up!" He shakes his head thinking, I should have started with this! He now has to munge through that camera footage! Santa had never worked on so much data before, so he started to get a bit worried that he might be unable to analyze it in time. He started pacing around the room trying to think up a workaround. Time was flying by and he still did not know how to speed up the video analysis. Just when he's about to give up, the door opens and Beatrice walks in. Santa almost trips as he runs to hug his wife! Beatrice is startled for a bit but then breaks into a smile. "What is it, dear? Did you miss me so much?" Santa replies, "You can't imagine how much! I've been doing everything on my own and I really need your help!" Beatrice smiles and says, "Well, what are we waiting for? Let's get down to it!"

Santa explains the problem to Beatrice in detail and tells her how far he's reached in the analysis. Beatrice thinks for a bit and asks Santa, "Did you try using Keras on top of TensorFlow?" Santa, blank for a minute, nods his head. Beatrice continues, "Well, from my experience, Keras gives TensorFlow a boost of about 10%, which should help quicken the analysis." Santa looks like he's made the best decision marrying Beatrice and hugs her again! "Bea, you're a genius!" he cries out. "Yeah, and don't forget to use Matplotlib!" she yells back as Santa hurries back to his abode.

He's off to work again, this time saddling up Keras to work on top of TensorFlow. Hundreds and thousands of terabytes of video data flow into the machines. He channels the output through OpenCV and ties it with TensorFlow to add a hint of deep learning. He quickly types out some Python scripts to integrate both tools and create the optimal outcome. And then the wait begins. Santa keeps looking at his watch every half hour, hoping that the processing happens fast. The hardware has begun heating up quite a bit, and he quickly races over to bring a cooler from across the room. While he waits for the videos to finish up, he starts working on sifting out the data from the text and audio. He remembers what Beatrice said and uses Matplotlib to visualize it. Soon he has a beautiful map of the world with all the children's names and their possible gifts beside them. Three days later, the video processing gets done. Keras truly worked wonders for TensorFlow! Santa now has another set of data to help him narrow down the gift list. A few hours later, he's got his whole list visualized in Matplotlib.

[Image: Santa Claus riding his sleigh with a gift box, against snow falling on a fir-tree forest]

There's one last thing left to do! He suits up in red and races out the door to Rudolph and the other reindeer, unties them from the fence and leads them over to the sleigh. Once they're fastened, he loads an empty bag onto the sleigh and it magically gets filled up. He quickly checks it to see if all is well, and they're off! It's Christmas morning and all the kids are racing out of bed to rip their presents open! There are smiles all around and everyone's got a gift, just as the saying goes! Even the ones who've been naughty have gotten gifts. Back in the North Pole, the old man is back in his abode, relaxing in an easy chair with his legs up on the table. The screen in front of him runs a real-time video feed of kids all over the world opening up their presents. A big smile on his face, Santa turns to look out the window at the glowing red light amongst the snow and takes a swig of brandy from a hip flask. Thanks to Data Science, this Christmas is the merriest yet!
article-image-tensorflow-js-contributor-kai-sasaki-on-how-tensorflow-js-eases-web-based-machine-learning-application-development

TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development

Sugandha Lahoti
28 Nov 2019
6 min read
Running Machine Learning applications on the web browser is one of the hottest trends in software development right now. Many notable machine learning projects are being built with Tensorflow.js. It is one of the most popular frameworks for building performant machine learning applications that run smoothly in a web browser. Recently, we spoke with Kai Sasaki, who is one of the initial contributors to TensorFlow.js. He talked about current and future versions of TF.js, how it compares to other browser-based ML tools and his contributions to the community. He also shared his views on why he thinks Javascript good for Machine Learning. If you are a web developer with working knowledge of Javascript who wants to learn how to integrate machine learning techniques with web-based applications, we recommend you to read the book, Hands-on Machine Learning with TensorFlow.js. This hands-on course covers important aspects of machine learning with TensorFlow.js using practical examples. Throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them through various examples. On how TensorFlow.js has improved web-based machine learning How do you think Machine Learning for the Web has evolved in the last 2-3 years? What are some current applications of web-based machine learning and TensorFlow.js? What can we expect in future releases? Machine Learning on the web platform is a field attracting more developers and machine learning practitioners. There are two reasons. First, the web platform is universally available. The web browser mostly provides us a way to access the underlying resource transparently. The second reason is security.raining a model on the client-side means you can keep sensitive data inside the client environment as the entire training process is completed on the client-side itself. The data is not sent to the cloud, making it more secure and less susceptible to vulnerabilities or hacking. In future releases as well, TensorFlow.js is expected to provide more secure and accessible functionalities. You can find various kinds of TensorFlow.js based applications here. How does TensorFlow.js compare with other web and browser-based machine learning tools? Does it make web-based machine learning application development easier? The most significant advantage of TensorFlow.js is the full compatibility of the TensorFlow ecosystem. Not only can a TensorFlow model be seamlessly used in TensorFlow.js, tools for visualization and model deployment in the TensorFlow ecosystem can also be used in TensorFlow.js. TensorFlow 2 was released in October. What are some new changes made specific to TensorFlow.js as a part of TF 2.0 that machine learning developers will find useful? What are your first impressions of this new release? Although there is nothing special related to TensorFlow 2.0, the full support of new backends is actively developed, such as WASM and WebGPU. These hardware acceleration mechanisms provided by the web platform can enhance performance for any TensorFlow.js application. It surely makes the potential of TensorFlow.js stronger and possible use cases broader. On Kai’s experience working on his book, Hands-on Machine Learning with TensorFlow.js Tell us the motivation behind writing your book Hands-on Machine Learning with TensorFlow.js. What are some of your favorite chapters/projects from the book? TensorFlow.js does not have much history because only three years have passed since its initial publication. 
Due to the lack of resources to learn TensorFlow.js usage, I was motivated to write a book illustrating how to using TensorFlow.js practically. I think chapters 4 - 9 of my book Hands-On Machine Learning with TensorFlow.js provide readers good material to practice how to write the ML application with TensorFlow.js. Why Javascript for Machine Learning Why do you think Javascript is good for Machine Learning? What are some of the good machine learning packages available in Javascript? How does it compare to other languages like Python, R, Matlab, especially in terms of performance? JavaScript is a primary programming language in the web platform so it can work as a bridge between the web and machine learning applications. We have several other libraries working similarly. For example, machinelearn.js is a general machine learning framework running with JavaScript. Although JavaScript is not a highly performant language, its universal availability in the web platform is attractive to developers as they can build their machine learning applications that are “write once, run anywhere”. We can compare the performance by running state-of-the-art machine learning models such as MobileNet or ResNet practically. On his contribution towards TF.js You are a contributor for TensorFlow.js and were awarded by the Google Open Source Peer Bonus Program. What were your main contributions? How was your experience working for TF.js? One of the significant contributions I have made was fast Fourier transformation operations. I have created the initial implementation of fft, ifft, rfft and irfft. I also added stft (short term Fourier transformation). These operators are mainly used for performing signal analysis for audio applications. I have done several bug fixes and test enhancements in TensorFlow.js too. What are the biggest challenges today in the field of Machine Learning and AI in web development? What do you see as some of the greatest technology disruptors in the next 5 years? While many developers are writing Python programming languages in the machine learning field, not many web developers have familiarity and knowledge of machine learning in spite of the substantial advantage of the integration between machine learning and web platform. I believe machine learning technologies will be democratized among web developers so that a vast amount of creativity is flourished in the next five years. By cooperating with these enthusiastic developers in the community, I believe the machine learning on the client-side or edge device will be one of the major contributions in the machine learning field. About the author Kai Sasaki works as a software engineer in Treasure Data to build large-scale distributed systems. He is one of the initial contributors to TensorFlow.js and contributes to developing operators for newer machine learning models. He has also received the Google Open Source Peer Bonus in 2018. You can find him on Twitter, Linkedin, and GitHub. About the book Hands-On Machine Learning with TensorFlow.js is a comprehensive guide that will help you easily get started with machine learning algorithms and techniques using TensorFlow.js. Throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them through various examples. By the end of this book, you will be able to create and optimize your own web-based machine learning applications using practical examples. 
Baidu adds Paddle Lite 2.0, new development kits, EasyDL Pro, and other upgrades to its PaddlePaddle platform
Introducing Spleeter, a TensorFlow based Python library that extracts voice and sound from any music track
TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!
NIPS 2017 Special: A deep dive into Deep Bayesian and Bayesian Deep Learning with Yee Whye Teh

Savia Lobo
15 Dec 2017
8 min read
Yee Whye Teh is a professor at the Department of Statistics of the University of Oxford and also a research scientist at DeepMind. He works on statistical machine learning, focusing on Bayesian nonparametrics, probabilistic learning, and deep learning. This article aims to bring our readers Yee's keynote speech at NIPS 2017. Yee's keynote ponders the interface between two perspectives on machine learning, Bayesian learning and deep learning, by exploring questions like: How can probabilistic thinking help us understand deep learning methods or lead us to interesting new methods? Conversely, how can deep learning technologies help us develop advanced probabilistic methods? For a more comprehensive and in-depth understanding of this approach, be sure to watch the complete keynote address by Yee Whye Teh on the NIPS Facebook page. All images in this article come from Yee's presentation slides and do not belong to us.

The history of machine learning has shown growth in both model complexity and model flexibility. Theory-led models have started to lose their shine, because machine learning is at the forefront of a revolution that could be called the data revolution, driven by data-led models. As opposed to theory-led models, data-led models try not to impose too many assumptions on the processes that have to be modeled; they are super-flexible non-parametric models that can capture these complexities, but they require a large amount of data to operate.

On the model flexibility side, various approaches have been explored over the years: kernel methods, Gaussian processes, Bayesian nonparametrics, and now deep learning. The community has also developed ever more complex frameworks, both graphical and programmatic, to compose large, complex models from simpler building blocks. In the 90s we had graphical models; later we had probabilistic programming systems, followed by deep learning systems like TensorFlow, Theano, and Torch. A recent addition is probabilistic Torch, which brings together ideas from both probabilistic Bayesian learning and deep learning.

On one hand we have Bayesian learning, which treats learning as inference in some probabilistic model. On the other hand we have deep learning, which views learning as the optimization of functions parametrized by neural networks. In recent years there has been an explosion of exciting research at the interface of these two popular approaches, resulting in increasingly complex and exciting models.

What is the Bayesian theory of learning?

Bayesian learning describes an ideal learner as one who interacts with the world in order to learn its state, which is given by θ. The learner makes observations x about the world and deduces a model in the Bayesian sense: a joint distribution over both the unknown state of the world θ and the observations x. The model consists of a prior distribution over θ and a likelihood for x given θ; combining them gives the reverse conditional distribution, also known as the posterior, which describes the totality of the agent's knowledge about the world after seeing x. This posterior can also be used for predicting future observations and acting accordingly.
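In symbols, the posterior described above is just Bayes' rule applied to the prior and the likelihood; this is standard notation rather than a formula taken from the slides:

```latex
p(\theta \mid x) \;=\; \frac{p(x \mid \theta)\, p(\theta)}{p(x)},
\qquad
p(x) \;=\; \int p(x \mid \theta)\, p(\theta)\, \mathrm{d}\theta .
```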
Issues associated with Bayesian learning

1. Rigidity:
  - Learning can be wrong if the model is wrong
  - Not all prior knowledge can be encoded as a joint distribution
  - Simple analytic forms are limiting for conditional distributions
2. Scalability: Computing the posterior exactly is intractable, so approximations have to be made, which introduces trade-offs between efficiency and accuracy. As a result, it is often assumed that Bayesian techniques are not scalable.

To address these issues, the speaker highlights some of his recent projects which showcase scenarios where deep learning ideas are applied to Bayesian models (Deep Bayesian learning) or, in reverse, Bayesian ideas are applied to neural networks (Bayesian Deep learning).

Deep Bayesian learning: deep learning assists Bayesian learning

Deep learning can improve Bayesian learning in the following ways:
  - Improve modeling flexibility by using neural networks in the construction of Bayesian models
  - Improve the inference and scalability of these methods by parameterizing the posterior with neural networks
  - Amortize inference over multiple runs

These can be seen in the following projects showcased by Yee:
  - Concrete VAEs (Variational Autoencoders)
  - FIVO: Filtered Variational Objectives

Concrete VAEs

What are VAEs? All the qualities mentioned above (improving modeling flexibility, improving inference and scalability, and amortizing inference over multiple runs using neural networks) can be seen in a class of deep generative models known as VAEs (Variational Autoencoders).

Fig: Variational Autoencoders

VAEs include latent variables that describe the contents of a scene, such as objects and pose. The relationship between these latent variables and the pixels has to be highly complex and nonlinear. So, in short, VAEs are used to parameterize the generative and variational posterior distributions, which allows for more flexible modeling. The key that makes VAEs work is the reparameterization trick.

Fig: Adding reparameterization to VAEs

The reparameterization trick is crucial for the continuous latent variables in VAEs, but many models naturally include discrete latent variables. Yee suggests applying reparameterization to the discrete latent variables as a workaround. This brings us to the concept of Concrete VAEs: CONtinuous relaxations of disCRETE distributions. The density of the Concrete distribution can also be calculated, and this distribution provides the reparameterization trick for discrete variables, which helps in calculating the KL divergence that is needed for variational inference.

FIVO: Filtered Variational Objectives

FIVO extends VAEs towards models for sequential and time-series data. It is built upon another extension of VAEs known as the Importance Weighted Autoencoder (IWAE), a generative model with a structure similar to that of the VAE, but which uses a strictly tighter log-likelihood lower bound. The keynote walks through the variational lower bound, its re-derivation from importance sampling, and why it is better to use multiple samples: with Importance Weighted Autoencoders we can draw multiple samples, which gives a tighter lower bound, and optimizing this tighter bound should lead to better learning.

The FIVO objectives build on the following observations:
  - We can use any unbiased estimator of the marginal probability p(x)
  - The tightness of the bound is related to the variance of the estimator
  - For sequential models, we can use particle filters, which produce unbiased estimators of the marginal probability and can also have much lower variance than importance samplers
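For reference, the single-sample and multi-sample bounds being compared here can be written in standard notation (these are the usual ELBO and K-sample IWAE bounds from the literature, not formulas transcribed from the slides):

```latex
% Single-sample variational lower bound (ELBO)
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right]

% Multi-sample importance-weighted bound, tighter as K grows
\log p_\theta(x) \;\ge\; \mathbb{E}_{z_1,\dots,z_K \sim q_\phi(z \mid x)}\!\left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p_\theta(x, z_k)}{q_\phi(z_k \mid x)} \right]
```

FIVO replaces the simple importance-sampling average inside the logarithm with a particle filter's unbiased estimate of the marginal likelihood, which typically has lower variance for sequential models.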
Bayesian Deep learning

The Bayesian approach to deep learning gives us counterintuitive and surprising ways to make deep learning scalable. In order to explore the potential of Bayesian learning with deep neural networks, Yee introduced a project named the Posterior Server.

The Posterior Server

The Posterior Server is a distributed server for deep learning. It makes use of the Bayesian approach in order to make neural networks highly scalable. This project focuses on distributed learning, where both the data and the computation can be spread across the network.

The figure above shows a set of workers, each of which communicates with a parameter server that maintains the authoritative copy of the network's parameters. At each iteration, each worker obtains the latest copy of the parameters from the server, computes a gradient update based on its own data, and sends the update back to the server, which applies it to the authoritative copy. Communication over the network tends to be slower than the computation that can be done locally, so one might take multiple gradient steps per iteration before sending the accumulated update back to the parameter server. The problem is that the parameters on the worker quickly get out of sync with the authoritative copy on the parameter server. This leads to stale updates, which introduce noise into the system, and we often need frequent synchronizations across the network for the algorithm to learn in a stable fashion.

The main idea, in the Bayesian context, is that we don't want just a single parameter value; we want a whole distribution over parameters. This relaxes the need for frequent synchronizations across the network and hopefully leads to algorithms that are robust to less frequent communication. Each worker simply constructs its own tractable approximation to its own likelihood function and sends this information to the Posterior Server, which combines these approximations to form the full posterior, or an approximation of it. The approximations are based on the statistics of a sampling algorithm that runs locally on each worker.

The actual algorithm combines variational algorithms, stochastic gradient EP, and Markov chain Monte Carlo on the workers themselves. The variational part of the algorithm handles communication across the network, whereas the MCMC part handles the sampling needed to construct the statistics that the variational part requires. For scalability, a stochastic gradient Langevin algorithm is used: a simple generalization of SGD that injects additional noise so as to sample from the posterior (the update rule is sketched at the end of this article). To experiment with this server, densely connected neural networks with 500 ReLU units were trained on the MNIST dataset. You can get a more detailed understanding of these examples from the keynote video.

This interface between Bayesian learning and deep learning is a very exciting frontier. Researchers have brought the management of uncertainty into deep learning, and flexibility and scalability into Bayesian modeling. Yee concludes with two questions for the audience to think about:
  - Does being Bayesian in the space of functions make more sense than being Bayesian in the space of parameters?
  - How do we deal with uncertainty under model misspecification?
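For reference, the stochastic gradient Langevin dynamics update mentioned above takes the following standard form from the literature (Welling and Teh, 2011), where a minibatch of n points from a dataset of size N is used at step t with step size ε_t; it is quoted here as background, not from the slides:

```latex
\theta_{t+1} \;=\; \theta_t
  \;+\; \frac{\epsilon_t}{2} \left( \nabla_\theta \log p(\theta_t)
  \;+\; \frac{N}{n} \sum_{i=1}^{n} \nabla_\theta \log p(x_{t_i} \mid \theta_t) \right)
  \;+\; \eta_t,
\qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I)
```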