
Tech News - Data


Digitizing the offline: How Alibaba’s FashionAI can revive the waning retail industry

Abhishek Jha
15 Nov 2017
3 min read
Imagine visiting a store, forgetting what you came to buy, and having the screen in front of you recall your preferences. Before you can even struggle with a selection, the system suggests what you might try on and which items you are likely to end up purchasing. All of this is already happening in the malls of China, thanks to artificial intelligence, and to the Alibaba Group, which is on a mission to revive the offline retail market.

Ever since its inception in 2009, the Singles' Day festival has been the biggest shopping event in China, and there was no better occasion for Alibaba to test its artificial intelligence product FashionAI. This year, sales soared: Alibaba's gross merchandise volume reached a mammoth $25 billion on Nov. 11, comfortably beating last year's figure of $17.8 billion, with much of the gain attributed to the FashionAI initiative.

At a time when offline retail is declining all across the world, Alibaba's FashionAI could single-handedly reinvent the market and draw shoppers back to the malls. When the tiny sensors embedded in the garment you just tried on suggest all the 'matching' items, you do not mind visiting a store with an easy-to-use, recognizable interface at the front that spares you the old drudgeries of retail shopping.

The FashionAI screen interface uses machine learning to make suggestions based on the items being tried on. It extends the information stored in the product tag to generate recommendations. Using the system, a customer can try clothes on, get related 'smart' suggestions from the AI, and then finalize a selection on the screen.

Most importantly, the AI assistant doesn't intend to replace humans with robots; it integrates the two to deliver better service. When you want to try something different, you can press a button and a store attendant will be right there. Why cut down on human intervention at all, then? The point is that it is nearly impossible for human staff to remember the shopping preferences of every customer, whereas an artificial intelligence system can do so at scale. This is why researchers thought to apply deep learning to real-world scenarios like these. Unlike a human store attendant who may get irked by a shopper's endless changes of mind, the AI system is programmed to leverage big data to make 'smart' decisions. What's more, it gets better with time, learning continuously from the inputs it receives from its surroundings.

FashionAI is still an experiment, but Alibaba is on course to create history if the Chinese e-commerce giant succeeds in breathing life back into retail. As CEO Daniel Zhang reiterated, Alibaba is going to "digitize the offline retail world." That is quite a first for such a line of thinking. But then, customers don't distinguish between offline and online as long as it serves their interest.


Dr. Brandon explains 'Transfer Learning' to Jon

Shoaib Dabir
15 Nov 2017
5 min read
[box type="shadow" align="" class="" width=""]Dr. Brandon: Hello and welcome to another episode of 'Date with Data Science'. Today we are going to talk about a topic that is all the rage these days in the data science community: transfer learning.

Jon: 'Transfer learning' sounds all sci-fi to me. Is it like the thing Prof. X does in X-Men, reading other people's minds using that dome-like headset in his chamber?

Dr. Brandon: If we are going to get X-Men involved, what Prof. X does is closer to deep learning. We will talk about that another time. Transfer learning is simpler to explain. It's what you actually do every time you get into a character, Jon. Say you are given the role of Jack Sparrow to play. You will probably read a lot about pirates, watch a lot of pirate movies, even watch Johnny Depp in character, and form your own version of Jack Sparrow. Now, after that acting assignment is over, say you are given the opportunity to audition for the role of Captain Hook, the famous pirate from Peter Pan. You won't do your research from ground zero this time. You will retain the general mannerisms of a pirate you learned from your previous role and only learn the nuances of Captain Hook, like acting one-handed.

Jon: That's pretty cool! So you're saying machines can also learn this way?

Dr. Brandon: Of course, that's what transfer learning is all about: learn something, abstract the learning sufficiently, then apply it to another related problem. The following is an excerpt from a book by Kuntal Ganguly titled Learning Generative Adversarial Networks.[/box]

Pre-trained models are not optimized for tackling user-specific datasets, but they are extremely useful when the task at hand is similar to the task the model was trained on. For example, the popular InceptionV3 model is optimized for classifying images across a broad set of 1,000 categories, while our domain might be classifying a handful of dog breeds. A well-known deep learning technique that adapts an existing trained model to a similar, new task is known as transfer learning. This is why transfer learning has gained a lot of popularity among deep learning practitioners, and in recent years it has become the go-to technique in many real-life use cases. It is all about transferring knowledge (or features) between related domains.

Purpose of Transfer Learning

Let's say you have trained a deep neural network to differentiate between fresh and rotten mangoes. During training, the network requires thousands of rotten and fresh mango images and hours of training to learn things like the fact that a rotten fruit oozes liquid and produces a bad odor. With this training experience, the network can then be used for a different task/use case: differentiating between rotten and fresh apples, using the rotten-fruit features learned while training on mango images.

The general approach of transfer learning is to train a base network and then copy its first n layers into the first n layers of a target network. The remaining layers of the target network are initialized randomly and trained on the target use case.

The main scenarios for using transfer learning in your deep learning workflow are as follows:

Smaller datasets: When you have a smaller dataset, building a deep learning model from scratch won't work well. Transfer learning provides a way to apply a pre-trained model to new classes of data. For example, a pre-trained model built from one million ImageNet images will converge to a decent solution after training on just a fraction of a smaller dataset such as CIFAR-10, whereas a deep learning model built from scratch on that smaller dataset would struggle.

Less resource: Deep learning processes (such as convolution) require a significant amount of resources and time, and are best suited to high-grade GPU-based machines. With pre-trained models, however, you can easily train across a full training set (say, 50,000 images) in less than a minute on a laptop without a GPU, since most of the time the model is only modified in its final layer, with a simple update of just a classifier or regressor.

Various approaches to using pre-trained models

Using the pre-trained architecture: Instead of transferring the weights of the trained model, we can use only the architecture and initialize our own random weights for the new dataset.

Feature extractor: A pre-trained model can be used as a feature extraction mechanism by simply removing the output layer of the network (the one that gives the probabilities for each of the n classes) and freezing all the previous layers of the network as a fixed feature extractor for the new dataset.

Partially freezing the network: Instead of replacing only the final layer and extracting features from all previous layers, we can sometimes train our new model partially, that is, keep the weights of the initial layers frozen while retraining only the higher layers. The number of frozen layers can be treated as one more hyperparameter.

Next, read about how transfer learning is being used in the real world. If you enjoyed the above excerpt, do check out the book it is from.
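To make the feature-extractor approach above concrete, here is a minimal Keras sketch (an illustration added here, not from the book): it freezes a pre-trained InceptionV3 base and retrains only a small classifier head. The class count, layer sizes, and training data are illustrative assumptions.

import tensorflow as tf

# Load InceptionV3 trained on ImageNet, dropping its 1000-class output layer.
base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                         input_shape=(299, 299, 3), pooling="avg")
base.trainable = False  # freeze every pre-trained layer (pure feature extractor)

# New classifier head for a hypothetical 5-breed dog dataset.
num_classes = 5
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)  # trains only the new head

Unfreezing a few of the top Inception blocks (and lowering the learning rate) would turn this into the "partially freezing the network" variant described above.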


Spark + H2O = Sparkling Water for your machine learning needs

Aarthi Kumaraswamy
15 Nov 2017
3 min read
[box type="note" align="" class="" width=""]The following is an excerpt from the book Mastering Machine Learning with Spark, Chapter 1, Introduction to Large-Scale Machine Learning and Spark, written by Alex Tellez, Max Pumperla, and Michal Malohlava. This article introduces Sparkling Water, H2O's integration of its platform within the Spark project, which combines the machine learning capabilities of H2O with all the functionality of Spark.[/box]

H2O is an open source machine learning platform that plays extremely well with Spark; in fact, it was one of the first third-party packages deemed "Certified on Spark". Sparkling Water (H2O + Spark) is H2O's integration of its platform within the Spark project, which combines the machine learning capabilities of H2O with all the functionality of Spark. This means that users can run H2O algorithms on Spark RDDs/DataFrames for both exploration and deployment purposes. This is made possible because H2O and Spark share the same JVM, which allows for seamless transitions between the two platforms. H2O stores data in the H2O frame, a columnar-compressed representation of your dataset that can be created from a Spark RDD and/or DataFrame. Throughout much of this book, we will be referencing algorithms from Spark's MLlib library and H2O's platform, showing how to use both libraries to get the best possible results for a given task.

The following is a summary of the features Sparkling Water comes equipped with:

Use of H2O algorithms within a Spark workflow
Transformations between Spark and H2O data structures
Use of Spark RDDs and/or DataFrames as inputs to H2O algorithms
Use of H2O frames as inputs to MLlib algorithms (this will come in handy when we do feature engineering later)
Transparent execution of Sparkling Water applications on top of Spark (for example, we can run a Sparkling Water application within a Spark stream)
The H2O user interface to explore Spark data

Design of Sparkling Water

Sparkling Water is designed to be executed as a regular Spark application. Consequently, it is launched inside a Spark executor created after submitting the application. At this point, H2O starts its services, including a distributed key-value (K/V) store and memory manager, and orchestrates them into a cloud. The topology of the created cloud follows the topology of the underlying Spark cluster.

As stated previously, Sparkling Water enables transformation between different types of RDDs/DataFrames and H2O's frame, and vice versa. When converting from a hex frame to an RDD, a wrapper is created around the hex frame to provide an RDD-like API; in this case, data is not duplicated but served directly from the underlying hex frame. Converting from an RDD/DataFrame to an H2O frame requires data duplication, because it transforms data from Spark into H2O-specific storage. However, data stored in an H2O frame is heavily compressed and does not need to be preserved as an RDD anymore.

If you enjoyed this excerpt, be sure to check out the book it appears in.
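As a rough illustration (not from the book excerpt) of how the Spark-to-H2O hand-off looks in code, here is a minimal PySpark sketch using the pysparkling package. The method names (H2OContext.getOrCreate, asH2OFrame, asSparkFrame) follow the Sparkling Water Python API, but exact signatures vary between Sparkling Water releases, so treat this as an assumption to check against the docs for your version.

from pyspark.sql import SparkSession
from pysparkling import H2OContext   # ships with the h2o-pysparkling package

spark = SparkSession.builder.appName("sparkling-water-demo").getOrCreate()
# Older releases take the SparkSession as an argument; newer ones take none.
hc = H2OContext.getOrCreate(spark)   # starts H2O services inside the Spark executors

# A small Spark DataFrame standing in for real data.
df = spark.createDataFrame([(1.0, 0), (2.5, 1), (3.1, 1)], ["feature", "label"])

h2o_frame = hc.asH2OFrame(df)        # duplicates and compresses into an H2O frame
h2o_frame.describe()                 # explore it with H2O's own API

spark_df_again = hc.asSparkFrame(h2o_frame)   # wraps the hex frame, no data copy
spark_df_again.show()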


TensorFlow Lite developer preview is here

Savia Lobo
15 Nov 2017
3 min read
Team TensorFlow has announced the developer preview of TensorFlow Lite, a feather-light solution for mobile and embedded devices that was first unveiled at the I/O developer conference in May. TensorFlow has been a popular framework grabbing everyone's attention since its inception, and its adoption can be seen everywhere from enormous server racks to tiny IoT (Internet of Things) devices; now it is the turn of mobile and embedded devices. Since that May debut, several competitors have come up with their own versions of AI on mobile: Apple's Core ML and the cloud service from Clarifai are some popular examples. TensorFlow Lite is available for both Android and iOS devices.

TensorFlow Lite is designed to be:

Lightweight: It allows inference of on-device machine learning models with a small binary size, allowing faster initialization/startup.
Fast: Model loading time is dramatically improved, with accelerated hardware support.
Cross-platform: It includes a runtime tailor-made to run on various platforms, starting with Android and iOS.

Recently, there has been an increase in the number of mobile devices that use custom-built hardware to carry out ML workloads efficiently. Keeping this in mind, TensorFlow Lite also supports the Android Neural Networks API to take advantage of these new accelerators. When accelerator hardware is not available, TensorFlow Lite falls back to optimized CPU execution, which ensures that your models still run fast on a large set of devices. It also allows low-latency inference for on-device ML models.

Let's now have a look at the lightweight architecture (source: https://www.tensorflow.org/mobile/tflite/). Starting from the top and moving down, it consists of:

A trained TensorFlow model, saved on disk.
A TensorFlow Lite Converter program, which converts the TensorFlow model into the TensorFlow Lite format.
A TensorFlow Lite Model File format based on FlatBuffers, optimized for maximum speed and minimum size.

Further down the architecture, the TensorFlow Lite model file is deployed into Android and iOS applications. Within each mobile application there is a Java API, a C++ API, and an interpreter. Developers also have the choice to implement custom kernels with the C++ API, which can then be used by the interpreter.

TensorFlow Lite also supports several models trained and optimized for mobile devices:

MobileNet: Able to identify across 1,000 varied object classes, and designed specifically for efficient execution on mobile and embedded devices.
Inception v3: An image recognition model similar to MobileNet in functionality; though larger in size, it offers higher accuracy.
Smart Reply: An on-device conversational model that provides one-touch replies to incoming chat messages. Many Android Wear messaging apps ship with this feature.

Both Inception v3 and MobileNet are trained on the ImageNet dataset, so either can easily be retrained on your own image dataset via transfer learning.

TensorFlow already has the TensorFlow Mobile API, which supports mobile and embedded deployment of models, so the obvious question is: why TensorFlow Lite? Team TensorFlow's answer, on their official blog post, is: "Going forward, TensorFlow Lite should be seen as the evolution of TensorFlow Mobile, and as it matures it will become the recommended solution for deploying models on mobile and embedded devices. With this announcement, TensorFlow Lite is made available as a developer preview, and TensorFlow Mobile is still there to support production apps." For more information on TensorFlow Lite, you can visit the official documentation page here.
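For readers who want to see what the convert-then-interpret pipeline looks like in code, here is a minimal sketch using today's TensorFlow 2.x Python API; the 2017 developer preview shipped a standalone converter tool instead, and the model path below is an illustrative assumption.

import numpy as np
import tensorflow as tf

# Convert a SavedModel (path is hypothetical) into the FlatBuffer-based .tflite format.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/")
tflite_bytes = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)

# On-device-style inference through the TensorFlow Lite interpreter.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
dummy = np.zeros(inp["shape"], dtype=inp["dtype"])   # placeholder input tensor
interpreter.set_tensor(inp["index"], dummy)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]))          # model output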


15th Nov.' 17 - Headlines

Packt Editorial Staff
15 Nov 2017
4 min read
Python 2.7 countdown for NumPy, TensorFlow Lite developer preview, Twitter's premium APIs, and more in today's trending stories around data science news.

NumPy to only support Python 3 in coming years

NumPy drops Python 2.7 support from 2020, team announces transition plan

As the Python core team is planning to stop supporting Python 2 in 2020, NumPy has decided to eventually drop Python 2.7 support. Citing an "increasing burden" on their limited resources, NumPy announced a plan under which all NumPy releases will fully support both Python 2 and Python 3 until Dec. 31, 2018, but effective Jan. 1, 2019, any new feature releases will support only Python 3. The last Python 2 supporting release will be a long term support (LTS) release, maintained by the community until Dec. 31, 2019. "To minimize disruption, running pip install numpy on Python 2 will continue to give the last working release in perpetuity, but after Jan. 1, 2019 it may not contain the latest features, and after Jan. 1, 2020 it may not contain the latest bug fixes," the team said in its announcement.

New announcements from Google

Google launches developer preview of TensorFlow Lite for mobile machine learning

Google has announced the developer preview of TensorFlow Lite, a software library aimed at creating a more lightweight machine learning solution for smartphones and embedded devices. The company is calling it an evolution of TensorFlow for mobile, and it is available now for both Android and iOS app developers. TensorFlow Lite enables low-latency inference of on-device machine learning models, Google said, adding that it is designed from scratch to be lightweight, cross-platform, and fast. TensorFlow Lite, which was first announced at the I/O developer conference in May, supports the Android Neural Networks API introduced with the Android 8.1 developer preview.

Google adds Multi-Region support in Cloud Spanner

Google has announced the general availability of Cloud Spanner Multi-Region configurations, extending Cloud Spanner's transactions and synchronous replication across regions and continents. The new release means that regardless of where users may be, apps backed by Cloud Spanner can read and write up-to-date data globally with minimal latency. Besides, when running a Multi-Region instance, the database will be able to survive a regional failure, with 10x less downtime.

Twitter's subscription access to its data

Twitter announces premium APIs, starts with Tweet search at $149/month

Twitter has launched a bundle of premium application programming interfaces that will give developers access to more data, such as more Tweets per request and more complex queries (so far Twitter has offered only basic query functionality). These premium APIs will serve as a bridge between Twitter's free APIs and its enterprise versions. As a first effort, Twitter has launched a public beta of its Search Tweets API, which provides access to 30 days of Twitter data and will start at $149 a month. Twitter also introduced a new self-serve developer portal that gives developers more transparent access to their data usage.

Announcing Elasticsearch 6.0

Elasticsearch 6.0 released: sequence IDs, circuit breakers, index sorting key improvements

Elasticsearch 6.0 has been released. Among the key improvements, sequence IDs establish consensus on the sequence of operations between a primary and a replica shard, improving the ability to maintain coherency between data and helping to address a gap Elasticsearch has had over the years. Next, the circuit breakers improve detection of requests that end up consuming lots of resources, so that such requests can be isolated without bringing down a cluster. While features like index sorting can significantly boost query-time performance, another feature, sparse doc values, changes the way sparsely populated fields are stored, resulting in between 30 percent and 70 percent savings in storage space. There are other new features spread out across the Elastic Stack, which comprises Kibana, Beats, and Logstash: respectively, Elasticsearch's solutions for visualization and dashboards, data ingestion, and log storage.

IBM's new software for AI, machine and deep learning

IBM unveils Deep Learning Impact, updates Spectrum LSF Suites and Spectrum Conductor

IBM's new Deep Learning Impact (DLI) software will help users develop AI models using popular open-source deep learning frameworks like Spark, TensorFlow, and Caffe. The DLI tools complement the PowerAI deep learning enterprise software distribution and will be added to IBM's Spectrum Conductor software from December. In addition, the new release of IBM Spectrum LSF Suites combines powerful workload management and reporting with a new intuitive user interface providing simple and flexible access. Finally, the latest version of IBM Spectrum Scale software provides support to move workloads such as unified file, object, and HDFS data from where they are stored to where they are analyzed. IBM said these new software offerings could help in production deployments of parallel processing and clustered computing.


Introducing Google's Tangent: A Python library with a difference

Sugandha Lahoti
14 Nov 2017
3 min read
The Google Brain team, in a recent blog post, announced the arrival of Tangent, a free, open source Python library for ahead-of-time automatic differentiation. Most machine learning algorithms require the calculation of derivatives and gradients; doing this manually is time-consuming as well as error-prone. Automatic differentiation, or autodiff, is a set of techniques to accurately compute the derivatives of numeric functions expressed as computer programs. Autodiff techniques allow large-scale machine learning models to run with high performance and better usability.

Tangent uses source code transformation (SCT) in Python to perform automatic differentiation. What it basically does is take Python source code as input and produce new Python functions as output; the new Python function calculates the gradient of the input function. This makes the automatically generated derivative code as readable as the rest of the program. In contrast, TensorFlow and Theano, the two most popular machine learning frameworks, do not perform autodiff on Python code. They instead use Python as a metaprogramming language to define a data flow graph on which SCT is performed, which is at times confusing to the user, since it involves a separate programming paradigm. (Source: https://github.com/google/tangent/blob/master/docs/toolspace.png)

Tangent has a one-function API:

import tangent
df = tangent.grad(f)

To also print out the derivative code:

import tangent
df = tangent.grad(f, verbose=1)

Because it uses SCT, Tangent generates a new Python function. This new function follows standard semantics, and its source code can be inspected directly. This makes it easy for users to understand, easy to debug, and free of runtime overhead. Another highlight is that it is easily compatible with TensorFlow and NumPy. It is high performing and built on Python, which has a large and growing community. For processing arrays of numbers, TensorFlow Eager functions are also supported in Tangent. The library also auto-generates derivatives of code that contains if statements and loops, and it provides easy ways to generate custom gradients. It improves usability by using abstractions for easily inserting logic into the generated gradient code.

Tangent provides forward-mode automatic differentiation. This is a better alternative to backpropagation (reverse mode) in cases where the number of outputs exceeds the number of inputs, since forward-mode autodiff runs in time proportional to the number of input variables. According to the GitHub repository, "Tangent is useful to researchers and students who not only want to write their models in Python but also read and debug automatically-generated derivative code without sacrificing speed and flexibility."

Currently, Tangent does not support classes and closures, although the developers do plan on incorporating classes, which will enable class definitions of neural networks and parameterized functions. Tangent is still in the experimental stage. In the future, the developers plan to extend it to other numeric libraries and add support for more aspects of the Python language, including closures, classes, and more NumPy and TensorFlow functions. They also plan to add more advanced autodiff and compiler functionalities.
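As a quick, illustrative use of that one-function API (the function f below is a made-up example, not one from Google's announcement):

import tangent

def f(x):
    # A simple polynomial: f(x) = x^2 + 3x
    return x * x + 3.0 * x

df = tangent.grad(f)   # source-to-source transformation returns a new Python function
print(df(2.0))         # derivative 2x + 3 evaluated at x = 2.0, i.e. 7.0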
To summarize, here is a bullet list of Tangent's key features:

Automatic differentiation capabilities
Generated code that is easy to interpret, debug, and modify
Easy compatibility with TensorFlow and NumPy
Custom gradients
Forward-mode autodiff
High performance and optimization

You can learn more about the project on their official GitHub.

14th Nov.' 17 - Headlines

Packt Editorial Staff
14 Nov 2017
5 min read
New machine learning language Tile, new HPC systems from Dell EMC and HPE, Microsoft's Neural Fuzzing, and Amazon's project Ironman in today's data science news.

Introducing Tile

Tile: A new language for machine learning from Vertex.AI

Vertex.AI has released a new machine learning language called Tile. It is a tensor manipulation language used in PlaidML's backend to generate custom kernels for each specific operation on each GPU. The automatically produced kernels make it easier to add support for GPUs and new processors, and save time and effort overall. Tile's syntax balances expressiveness and optimization to cover the widest range of operations needed to build neural networks. It closely resembles the mathematical notation for describing linear algebra operations, and fully supports automatic differentiation. Vertex.AI said in its official blog that Tile was designed to be parallelizable as well as analyzable: in Tile, it is possible to analyze issues ranging from cache coherency to use of shared memory and memory bank conflicts.

Dell EMC announces new HPC systems

Dell EMC announces high-performance computing bundles aimed at AI, deep learning

At the SuperComputing 2017 conference in Denver, Dell EMC introduced a set of high-performance computing (HPC) systems, Dell EMC Ready Bundles for Machine and Deep Learning. These systems, the company said, are intended to bring HPC and data analytics into the mainstream, helping with fraud detection, image processing, financial investment analysis, and personalized medicine. The bundles are expected to be available in the first half of 2018.

Dell EMC announces new PowerEdge server designed specifically for HPC workloads

Dell EMC also introduced a new PowerEdge server designed specifically for HPC workloads: the Dell EMC PowerEdge C4140. As part of a joint development agreement with NVIDIA, the new server supports up to four NVIDIA Tesla V100 GPU accelerators with PCIe and NVLink high-speed interconnect technology. The servers also leverage two Intel Xeon Scalable processors and are thus "ideal for intensive machine learning and deep learning applications to drive advances in scientific imaging, oil and gas exploration, financial services and other HPC industry verticals." The Dell EMC PowerEdge C4140 is expected to be available worldwide in December 2017.

Hewlett Packard announces set of upgraded HPC systems for AI

HPE Apollo 2000 Gen10

In a bid to make high-performance computing (HPC) and AI more accessible to enterprises, Hewlett Packard Enterprise has announced a set of upgraded high-density compute and storage systems. The HPE Apollo 2000 Gen10 is a multi-server platform for enterprises looking to support HPC and deep learning applications with limited datacenter space. The platform supports NVIDIA Tesla V100 GPU accelerators to enable deep learning training and inference for use cases such as real-time video analytics for public safety. Enterprises deploying the HPE Apollo 2000 Gen10 system can start small with a single 2U shared infrastructure and scale out up to 80 HPE ProLiant Gen10 servers in a 42U rack.

HPE Apollo 4510 Gen10

The HPE Apollo 4510 Gen10 system is designed for enterprises with data-intensive workloads that use object storage as an active archive. The system has 16 percent more cores than the previous generation, HPE said, and offers storage capacity of up to 600TB in a 4U form factor with standard server depth. It also supports NVMe cards.

HPE Apollo 70

Hewlett Packard Enterprise has announced the launch of HPE Apollo 70, its first ARM-based HPC system, using Cavium's 64-bit ARMv8-A ThunderX2 server processor. Set to become available in 2018, the system is designed for memory-intensive HPC workloads and is compatible with HPC components from HPE's ecosystem partners, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server for ARM, and Mellanox InfiniBand and Ethernet fabric solutions.

HPE LTO-8 Tape

Hewlett Packard announced HPE LTO-8 Tape, which allows enterprises to offload primary storage to tape, with a storage capacity of 30 terabytes per tape cartridge, double that of the previous LTO-7 generation. The HPE LTO-8 Tape is slated for general availability in December 2017.

HPE T950

The HPE T950 tape library now stores up to 300 petabytes of data, Hewlett Packard announced. The HPE TFinity ExaScale tape library provides storage capacity for up to 1.6 exabytes of data, the company said.

Announcing Microsoft's Neural Fuzzing

Neural Fuzzing: Microsoft uses machine learning, deep neural networks for new vulnerability testing

Microsoft has announced a new method for discovering software security vulnerabilities, called 'neural fuzzing.' The method combines machine learning and deep neural networks to use past experience to better identify overlooked issues. Neural fuzzing takes traditional fuzz testing and adds a machine learning model that inserts a deep neural network into the feedback loop of a 'greybox fuzzer.' Development lead William Blum said the neural fuzzing approach is simple because it is not based on sophisticated handcrafted heuristics; instead, it simply learns from an existing fuzzer. He also argued that the new method explores data more quickly than a traditional fuzzer, and that it could be applied to any fuzzer, including blackbox and random fuzzers. "Right now, our model only learns fuzzing locations, but we could also use it to learn other fuzzing parameters such as the type of mutation or strategy to apply," Blum said.

Amazon to launch Ironman

Amazon Web Services set to launch AI project Ironman, ease the use of Google's TensorFlow

Amazon Web Services could introduce a new service code-named 'Ironman' that will make it easier for people to do artificial intelligence work involving lots of different kinds of data, according to a report published in The Information. The Ironman program includes a new AWS cloud "data warehouse" service that collects data from multiple sources within a company and stores it in a central location. Besides, AWS plans to make it easier for people to use TensorFlow. Google made TensorFlow available under an open-source license in 2015, and the library is now widely used among researchers.


Introducing Tile: A new machine learning language with auto-generated GPU kernels

Sugandha Lahoti
14 Nov 2017
3 min read
Recently, Vertex.AI announced a simple and compact machine learning language for its PlaidML framework. Tile is a tensor manipulation language built to bring the PlaidML framework to a wider developer audience. PlaidML is Vertex.AI's open source, portable deep learning framework for deploying neural networks on any device.

A key obstacle the developers of PlaidML faced was scalability. For any framework to be adopted across a wide variety of platforms, software support is required; by software support we mean the implementation of software kernels, the glue between frameworks and the underlying system. Tile comes to the rescue here because it can automatically generate these kernels. This addresses the compatibility problem by making it easier to add support for different NVIDIA GPUs as well as other new types of processors, such as those from AMD and Intel. Tile runs in the backend of PlaidML to produce a custom kernel for each specific operation on each GPU. Because these kernels are machine generated, they are highly optimized, and this in turn makes it easy to add support for different processors. Using Tile, machine learning operations can be methodically implemented on parallel computing architectures and easily converted into optimized GPU kernels.

Another key feature of Tile is that the code is very easy to write and understand, because coding in Tile is similar to writing mathematical notation. In addition, all machine learning operations expressed in the language can be automatically differentiated. Being so easy to understand makes it readily adoptable by machine learning practitioners as well as software engineers and mathematicians. This is an example of a Tile matrix multiply:

function (A[M, L], B[L, N]) -> (C) {
    C[i, j: M, N] = +(A[i, k] * B[k, j]);
}

Note how closely it resembles linear algebra operations, with an easy syntax that is demonstrative as well as optimized to cover all the operations required to build neural networks.

PlaidML uses Tile as the intermediate language in its Keras integration, which significantly reduces the amount of backend code that has to be written for Keras and makes it easy to support and implement new operations such as dilated convolutions. Tile can also address and analyze issues such as cache coherency, shared memory usage, and memory bank conflicts.

According to the official Vertex.AI blog, Tile is characterized by:

Control-flow and side-effect free operations on n-dimensional tensors
Mathematically oriented syntax resembling tensor calculus
N-dimensional, parametric, composable, and type-agnostic functions
Automatic Nth-order differentiation of all operations
Suitability for both JITing and pre-compilation
Transparent support for resizing, padding and transposition

The developers are currently working to bring the language to a formal specification. In the future, they intend to use a similar approach to make TensorFlow, PyTorch, and other frameworks compatible with PlaidML. If you are interested in learning how to write code in Tile, you can check the Tile tutorial on their GitHub.
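For context on where Tile sits in practice, this is roughly how PlaidML is wired in as a Keras backend from Python, so that Keras operations are lowered to Tile-generated kernels. The install_backend() call follows PlaidML's documented usage; the toy model and data are illustrative assumptions.

# pip install plaidml-keras   (assumed; run plaidml-setup once to pick a device)
import plaidml.keras
plaidml.keras.install_backend()   # route Keras ops through PlaidML/Tile

import numpy as np
import keras
from keras import layers

# Tiny toy model; each layer below is compiled down to Tile-generated kernels.
model = keras.models.Sequential([
    layers.Dense(32, activation="relu", input_shape=(16,)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

x = np.random.rand(128, 16).astype("float32")
y = (x.sum(axis=1) > 8).astype("float32")
model.fit(x, y, epochs=2, batch_size=32, verbose=0)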


How Reinforcement Learning works

Pravin Dhandre
14 Nov 2017
5 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book by Rodolfo Bonnin titled Machine Learning for Developers.[/box]

Reinforcement learning is a field that has resurfaced recently, and it has become more popular in control and in finding solutions to games and situational problems, where a number of steps have to be taken to solve a problem. A formal definition of reinforcement learning is as follows:

"Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment." (Kaelbling et al. 1996)

In order to have a frame of reference for the type of problem we want to solve, we will start by going back to a mathematical concept developed in the 1950s, called the Markov decision process.

Markov decision process

Before explaining reinforcement learning techniques, we will explain the type of problem we will attack with them. When talking about reinforcement learning, we want to optimize a Markov decision process: a mathematical model that aids decision making in situations where the outcomes are partly random and partly under the control of an agent. The main elements of this model are an agent, an environment, and a state. (Figure: simplified scheme of a reinforcement learning process.)

The agent can perform certain actions (such as moving the paddle left or right). These actions can sometimes result in a reward r_t, which can be positive or negative (such as an increase or decrease in the score). Actions change the environment and can lead to a new state s_t+1, where the agent can perform another action a_t+1. The set of states, actions, and rewards, together with the rules for transitioning from one state to another, make up a Markov decision process.

Decision elements

To understand the problem, let's situate ourselves in the problem-solving environment and look at the main elements:

The set of states
The action to take, which is to go from one place to another
The reward function, the value represented by the edge
The policy, the way to complete the task
A discount factor, which determines the importance of future rewards

The main difference from traditional forms of supervised and unsupervised learning is the time taken to receive the reward, which in reinforcement learning is not instantaneous; it comes after a set of steps. Thus, the next state depends on the current state and the decision maker's action, and the state is not dependent on all the previous states (it doesn't have memory), so it complies with the Markov property. Since this is a Markov decision process, the probability of state s_t+1 depends only on the current state s_t and action a_t. (Figure: unrolled reinforcement mechanism.)

The goal of the whole process is to generate a policy P that maximizes rewards. The training samples are tuples <s, a, r>.

Optimizing the Markov process

Reinforcement learning is an iterative interaction between an agent and the environment. The following occurs at each timestep:

The process is in a state, and the decision maker may choose any action that is available in that state
The process responds at the next timestep by randomly moving into a new state and giving the decision maker a corresponding reward
The probability that the process moves into its new state is influenced by the chosen action, in the form of a state transition function

Basic RL techniques: Q-learning

One of the most well-known reinforcement learning techniques, and the one we will be implementing in our example, is Q-learning. Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. Q-learning tries to maximize the value of the Q-function, which represents the maximum discounted future reward when we perform action a in state s. Once we know the Q-function, the optimal action a in state s is the one with the highest Q-value. We can then define a policy π(s) that gives us the optimal action in any state, expressed as follows:

π(s) = argmax_a Q(s, a)

We can define the Q-function for a transition point (s_t, a_t, r_t, s_t+1) in terms of the Q-function at the next point (s_t+1, a_t+1, r_t+1, s_t+2), similar to what we did with the total discounted future reward. This equation is known as the Bellman equation for Q-learning:

Q(s_t, a_t) = r_t + γ max_a' Q(s_t+1, a')

In practice, we can think of the Q-function as a lookup table (called a Q-table) where the states (denoted by s) are rows and the actions (denoted by a) are columns, and the elements (denoted by Q(s, a)) are the rewards that you get if you are in the state given by the row and take the action given by the column. The best action to take in any state is the one with the highest reward:

initialize Q-table Q
observe initial state s
while (! game_finished):
    select and perform action a
    get reward r
    advance to state s'
    Q(s, a) = Q(s, a) + α(r + γ max_a' Q(s', a') - Q(s, a))
    s = s'

You will notice that the algorithm is basically doing stochastic gradient descent on the Bellman equation, backpropagating the reward through the state space (or episode) and averaging over many trials (or epochs). Here, α is the learning rate that determines how much of the difference between the previous Q-value and the discounted new maximum Q-value should be incorporated. (A flowchart of this process appears in the book.)

We have now reviewed Q-learning, one of the most important and innovative reinforcement learning architectures to have appeared in recent years. Every day, such reinforcement models are applied in innovative ways, whether to generate feasible new elements from a selection of previously known classes or even to beat professional players in strategy games. If you enjoyed this excerpt from the book Machine Learning for Developers, check out the book below.
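To ground the pseudocode, here is a small, self-contained NumPy sketch (added here, not taken from the book) that runs tabular Q-learning on a hypothetical 5-state corridor where the agent starts at the left end and only the rightmost state pays a reward:

import numpy as np

n_states, n_actions = 5, 2            # corridor of 5 cells; actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = np.zeros((n_states, n_actions))   # Q-table: rows = states, columns = actions
rng = np.random.default_rng(0)

def step(s, a):
    # Move left/right along the corridor; reaching the last cell pays 1 and ends the episode.
    s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward, s_next == n_states - 1

for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection, breaking ties among equal Q-values at random.
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(rng.choice(np.flatnonzero(Q[s] == Q[s].max())))
        s_next, r, done = step(s, a)
        # Bellman update from the excerpt: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

# The learned policy should favor action 1 ("go right") in the non-terminal states.
print(np.argmax(Q, axis=1))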


Has IBM edged past Google in the battle for Quantum Supremacy?

Abhishek Jha
14 Nov 2017
3 min read
Last month, when researchers at Google unveiled a blueprint for quantum supremacy, little did they know that rival IBM was about to snatch the pole position. In what could be the largest and most sophisticated quantum computer built to date, IBM has announced the development of a quantum computer capable of handling 50 qubits (quantum bits). Big Blue also announced a 20-qubit processor that will be made available through the IBM Q cloud by the end of the year.

"Our 20-qubit machine has double the coherence time, at an average of 90 microseconds, compared to previous generations of quantum processors with an average of 50 microseconds. It is also designed to scale; the 50-qubit prototype has similar performance," Dario Gil, who leads IBM's quantum computing and artificial intelligence research division, said in his blog post.

IBM's progress in this space has been truly rapid. After launching a 5-qubit system in May 2016, it followed with a 15-qubit machine this year, then upgraded the IBM Q experience to 20 qubits, with 50 qubits now in line. That is quite a leap in 18 months.

As a technology, quantum computing is a rather difficult area to understand, because information is processed differently. Unlike normal computers, which interpret either a 0 or a 1, quantum computers can live in multiple states at once, opening up all kinds of programming possibilities for this type of computing; add to that the coherence factor, which makes it very difficult for programmers to build a quantum algorithm. While the company did not divulge the technical details of how its engineers could simultaneously expand the number of qubits and increase the coherence times, it did mention that the improvements were due to better "superconducting qubit design, connectivity and packaging," and that the 50-qubit prototype is a "natural extension" of the 20-qubit technology, with both exhibiting "similar performance metrics."

The major goal, though, is to create a fault-tolerant universal system capable of correcting errors automatically while maintaining high coherence. "The holy grail is fault-tolerant universal quantum computing. Today, we are creating approximate universal, meaning it can perform arbitrary operations and programs, but it's approximating so that I have to live with errors and a limited window of time to perform the operations," Gil said.

The good news is that an ecosystem is building up. Through the IBM Q experience, more than 60,000 users have run over 1.7 million quantum experiments and generated over 35 third-party research publications. The fact that the beta testers included 1,500 universities, 300 high schools, and 300 private-sector participants means quantum computing is closer to real-world implementation in areas like medicine, drug discovery, and materials science. "Quantum computing will open up new doors in the fields of chemistry, optimisation, and machine learning in the coming years," Gil added. "We should savor this period in the history of quantum information technology, in which we are truly in the process of rebooting computing."

All eyes are now on Google, IBM's nearest rival in quantum computing at this stage. While IBM's 50-qubit processor has taken away half the charm of Google's soon-to-be-announced 49-qubit system, expect more surprises, as Google has so far managed to keep its entire quantum computing machinery behind closed doors.

13th Nov.' 17 - Headlines

Packt Editorial Staff
13 Nov 2017
3 min read
IBM's 50-qubit machine, a visual analytics tool called SpotLyt, and AllegroGraph Triple Attributes in today's trending stories in data science news.

The largest quantum computer 50Q

IBM announces 50-qubit quantum computer

IBM has announced a quantum computer that handles 50 quantum bits (qubits). The company said it also has a working prototype of a 20-qubit system that will be made available through the IBM Q cloud by year end. IBM did not divulge the technical details of how its engineers could simultaneously expand the number of qubits and increase the coherence times, but it did mention in the official statement that the improvements were due to better "superconducting qubit design, connectivity and packaging." The 50-qubit prototype, known as 50Q, is a "natural extension" of the 20-qubit technology and exhibits "similar performance metrics," the company added. The 50-qubit machine is so far the largest and most powerful quantum computer ever built. At this stage, IBM's nearest rival in quantum computing is Google, which could demonstrate a working 49-qubit system before the end of 2017.

Launching SpotLyt

Brytlyt announces visual analytics tool SpotLyt for billion-row data sets

GPU-accelerated database and analytics platform Brytlyt has introduced SpotLyt, a real-time visualization and analytics tool designed for massive datasets. SpotLyt can be used either as a stand-alone visualization tool or as an add-on to a company's current visualization set-up. "We built SpotLyt because we found existing visualization tools don't handle geo-visualization over 20,000 data points very well," Brytlyt CEO Richard Heyns said. "Since SpotLyt uses Brytlyt's own data rendering engine to visualize billion row datasets, analysts can now get a holistic and detailed point of view at their fingertips."

AllegroGraph more secure than ever

Franz adds Triple Attribute security to AllegroGraph

Franz has announced Triple Attributes for its semantic graph database AllegroGraph. The new feature provides the power and flexibility needed to address high-security data environments such as HIPAA access controls, privacy rules for banks, and security models for policing, intelligence, and government. "Enterprises want the flexibility of graph databases, but they also want the security they have come to rely on with relational databases," Franz CEO Jans Aasman said. Though the Triple Attributes feature was initiated for government-level data security, it can be applied to diverse data analytics needs, from real-world events like crop yields to storing blockchain hashes and ICO public keys for KYC applications. Triple Attribute security is now available in AllegroGraph v6.3.


Week at a Glance (4th - 10th Nov. '17): Top News from Data Science

Packt Editorial Staff
13 Nov 2017
2 min read
Last week saw some interesting partnerships between tech giants, new tool announcements in the conversational AI space, significant version updates, and further advancement towards the democratization of AI development and usage. Here is a quick rundown of news in the data science space worth your attention!

News Highlights

China's Baidu launches Duer OS Prometheus Project to accelerate conversational AI
Frenemies: Intel and AMD partner on laptop chip to keep Nvidia at bay
Introducing "Pyro" for deep probabilistic modeling
Salesforce myEinstein: Now build AI apps with 'clicks, not code'
Apache Kafka 1.0: From messaging system to streaming platform
Cisco Spark Assistant: World's first AI voice assistant for meetings

In Other News

10th Nov.' 17 - Headlines
AI.io launches PLATO, an AI-based operating platform for enterprise
HPE developing its own neural network chip that is faster than anything in the market
Atos launches next generation AI servers "BullSequana S" that are ultra-scalable, ultra-flexible

9th Nov.' 17 - Headlines
Bitcoin price surges to record high, then tanks, as plans to split the digital currency are called off
MongoDB 3.6 released: Change Streams, Retryable Writes among key updates in MongoDB's biggest ever release
Introducing Grid: A scalable Blockchain system for better performance, resource segregation and working governance model

8th Nov.' 17 - Headlines
spaCy 2.0 released with 13 new neural network models for 7+ languages
Microsoft says it will extend HoloLens AI processor to other devices from daily life
Cloud SQL for PostgreSQL integrates high availability and replication
Artificial Intelligence creeps into CryptoTrading, AiX claims to develop first AI broker

7th Nov.' 17 - Headlines
Google introduces Tangent, a Python library for automatic differentiation
Salesforce, Google form strategic partnership on cloud
Rockwell unveils Project Scio, a scalable analytics platform for industrial IoT applications
HPE launches Superdome Flex platform for high performance data analytics for mission critical workloads
Google releases its internal tool Colaboratory
Neuromation announces ICO to facilitate AI adoption with blockchain-powered platform
DefinedCrowd unveils data platform API at Web Summit 2017

6th Nov.' 17 - Headlines
Tableau announces support for Amazon Redshift Spectrum in Tableau 10.4
IBM brings new cloud data tools, updates Unified Data Governance Platform
IBM's Goodbye to Bluemix brand
Periscope Data unveils new platform to bolster "data driven culture" for professional data teams
Caviar announces real estate-backed digital asset platform
SIA: MCN collaborates with SAS to unveil single source data platform


PINT (Paper IN Two mins) - Making Neural Network Architectures generalize via Recursion

Amarabha Banerjee
13 Nov 2017
3 min read
This is a quick summary of the research paper titled Making Neural Programming Architectures Generalize via Recursion by Jonathon Cai, Richard Shin, and Dawn Song, published on 6th Nov 2016.

The idea of solving a common task is central to developing any algorithm or system. The primary challenge in designing such a system is generalizing the result to a large set of data. Simply put, using the same system, we should be able to predict accurate results when the amount of data is vast and varied across different domains. This is where most ANN systems fail. The researchers claim that the process of recursion, which is inherent in many algorithms, if introduced explicitly into the architecture, will help us arrive at a system that can predict accurate results over limitless amounts of data. This technique is called the recursive neural program. For more on this and on the different neural network programs, you can refer to the original research paper, which includes a sample illustration of a neural network program.

The problem with learned neural networks

The most common technique applied to date was to use a learned neural network: a method where a program was given increasingly complex tasks, for example solving the grade-school addition problem, in simpler words, adding two numbers. The problem with this approach was that the program kept producing correct answers as long as the number of digits was small. When the number of digits increased, the results became chaotic, some correct and some not, because the program chose a complex method to solve a problem of increasing complexity. The real reason behind this was the architecture, which stayed the same as the complexity of the problem increased; the program could not adapt in the end and gave chaotic responses.

The solution of recursion

The essence of recursion is that it helps the system break a problem down into smaller pieces and then solve those pieces separately. This means that irrespective of how complex the problem is, the recursive process breaks it down into standard units, so the solution remains uniform and consistent. Keeping the theory of recursion in mind, the researchers have implemented it in their neural network program and created a recursive architecture based on the Neural Programmer-Interpreter (NPI). (An illustration in the paper shows the different algorithms and techniques used to create neural network based programs.) The present system is based on the May 2016 formulation proposed by Reed et al. The system induces supervised recursion when solving any task: a particular function stores an output in a particular memory cell, then calls that output value back while checking against the actual desired result. This self-calling of the program (or function) automatically induces recursion, which in turn helps the program decompose the problem into multiple smaller units, so the results are more accurate than with other techniques.

The researchers have successfully applied this technique to four common tasks:

Grade-school addition
Bubble sort
Topological sort
Quicksort

They found that the recursive neural network architecture gives a 100 percent success rate in predicting correct results for all four of the above tasks. The flip side of this technique is still the amount of supervision required while performing the tasks, which will be the subject of further investigation and research.

For a more detailed approach and results on the different neural network programs and their performance, please refer to the original research paper.
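The paper's architecture learns these recursive call patterns with supervision; as a plain, non-neural reminder of what recursive decomposition looks like, here is quicksort (one of the four benchmark tasks) written recursively in Python. This is an illustration added here, not code from the paper.

def quicksort(xs):
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Each recursive call handles a strictly smaller sub-problem,
    # so the same procedure works no matter how long the input is.
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]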

Getting started with Storm Components for Real Time Analytics

Amarabha Banerjee
13 Nov 2017
8 min read
[box type="note" align="" class="" width=""]In this article by Shilpi Saxena and Saurabh Gupta from their book Practical Real-time data Processing and Analytics we shall explore Storm's architecture with its components and configure it to run in a cluster. [/box] Initially, real-time processing was implemented by pushing messages into a queue and then reading the messages from it using Python or any other language to process them one by one. The primary challenges with this approach were: In case of failure of the processing of any message, it has to be put back in to queue for reprocessing Keeping queues and the worker (processing unit) up and running all the time Below are the two main reasons that make Storm a highly reliable real-time engine: Abstraction: Storm is distributed abstraction in the form of Streams. A Stream can be produced and processed in parallel. Spout can produce new Stream and Bolt is a small unit of processing on stream. Topology is the top level abstraction. The advantage of abstraction here is that nobody must be worried about what is going on internally, like serialization/deserialization, sending/receiving message between different processes, and so on. The user must be focused on writing the business logic. A guaranteed message processing algorithm:  Nathan Marz developed an algorithm based on random numbers and XORs that would only require about 20 bytes to track each spout tuple, regardless of how much processing was triggered downstream. Storm Architecture and Storm components The nimbus node acts as the master node in a Storm cluster. It is responsible for analyzing topology and distributing tasks on different supervisors as per the availability. Also, it monitors failure; in the case that one of the supervisors dies, it then redistributes the tasks among available supervisors. Nimbus node uses Zookeeper to keep track of tasks to maintain the state. In case of Nimbus node failure, it can be restarted which reads the state from Zookeeper and start from the same point where it failed earlier. Supervisors act as slave nodes in the Storm cluster. One or more workers, that is, JVM processes, can run in each supervisor node. A supervisor co-ordinates with workers to complete the tasks assigned by nimbus node. In the case of worker process failure, the supervisor finds available workers to complete the tasks. A worker process is a JVM running in a supervisor node. It has executors. There can be one or more executors in the worker process. Worker co-ordinates with executor to finish up the task. An executor is single thread process spawned by a worker. Each executor is responsible for running one or more tasks. A task is a single unit of work. It performs actual processing on data. It can be either Spout or Bolt. Apart from above processes, there are two important parts of a Storm cluster; they are logging and Storm UI. The logviewer service is used to debug logs for workers at supervisors on Storm UI. The following are the primary characteristics of Storm that make it special and ideal for real-time processing. Fast Reliable Fault-Tolerant Scalable Programming Language Agnostic Strom Components Tuple: It is the basic data structure of Storm. It can hold multiple values and data type of each value can be different. Topology: As mentioned earlier, topology is the highest level of abstraction. It contains the flow of processing including spout and bolts. It is kind of graph computation. Stream: The stream is core abstraction of Storm. 
Setting up and configuring Storm

Before setting up Storm, we need to set up Zookeeper, which Storm requires.

Setting up Zookeeper

Below are instructions on how to install, configure, and run Zookeeper in standalone and cluster mode.

Installing

Download Zookeeper from http://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz. After the download, extract zookeeper-3.4.6.tar.gz as follows:

tar -xvf zookeeper-3.4.6.tar.gz

This extracts the Zookeeper distribution's files and folders.

Configuring

There are two types of deployment with Zookeeper: standalone and cluster. There is no big difference in configuration; cluster mode just needs a few extra parameters.

Standalone

Go to the conf folder of the extracted directory and change the zoo.cfg file as follows:

tickTime=2000                # Length of a single tick, in milliseconds; used to regulate heartbeats and timeouts.
initLimit=5                  # Amount of time (in ticks) to allow followers to connect and sync with the leader.
syncLimit=2                  # Amount of time (in ticks) to allow followers to sync with Zookeeper.
dataDir=/tmp/zookeeper/tmp   # Directory where Zookeeper keeps transaction logs.
clientPort=2182              # Listening port for clients to connect.
maxClientCnxns=30            # Maximum number of client connections to a Zookeeper node.

Cluster

In addition to the above configuration, add the following to the cluster configuration as well:

server.1=zkp-1:2888:3888
server.2=zkp-2:2888:3888
server.3=zkp-3:2888:3888

The format is server.x=[hostname]:nnnn:mmmm. Here, x is the ID assigned to each Zookeeper node. In the dataDir configured above, create a file named "myid" and put the corresponding ID of that Zookeeper node in it; the ID must be unique across the cluster, and the same ID is used as x here. nnnn is the port used by followers to connect with the leader node, and mmmm is the port used for leader election.

Running

Use the following command to run Zookeeper from the Zookeeper home directory:

bin/zkServer.sh start

The console returns after the following message and the process runs in the background:

Starting zookeeper ... STARTED

The following command can be used to check the status of the Zookeeper process:

bin/zkServer.sh status

In standalone mode the output is:

Mode: standalone

In cluster mode the output is:

Mode: follower   # on a follower node
Mode: leader     # on the leader node

Setting up Apache Storm

Below are instructions on how to install, configure, and run Storm with nimbus and supervisors.

Installing

Download Storm from http://www.apache.org/dyn/closer.lua/storm/apache-storm-1.0.3/apache-storm-1.0.3.tar.gz. After the download, extract apache-storm-1.0.3.tar.gz as follows:

tar -xvf apache-storm-1.0.3.tar.gz

This extracts the Storm distribution's files and folders.

Configuring

Go to the conf folder and add/edit the following properties in storm.yaml.

Set the Zookeeper hostnames in the Storm configuration:

storm.zookeeper.servers:
- "zkp-1"
- "zkp-2"
- "zkp-3"

Set the Zookeeper port:

storm.zookeeper.port: 2182

Set the nimbus node hostname so that the Storm supervisors can communicate with it:

nimbus.host: "nimbus"

Set the Storm local data directory, used to keep small amounts of information such as conf, jars, and so on:

storm.local.dir: "/usr/local/storm/tmp"

Set the number of workers that will run on the current supervisor node. It is best practice to use the same number of workers as the number of cores in the machine:

supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
- 6705

Allocate memory to the worker, supervisor, and nimbus processes:

worker.childopts: "-Xmx1024m"
nimbus.childopts: "-XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70"
supervisor.childopts: "-Xmx1024m"

Topology-related configuration: the first setting is the maximum amount of time (in seconds) allowed for a tuple's tree to be acknowledged (fully processed) before it is considered failed. The second setting turns debug logs off, so Storm will generate only info logs:

topology.message.timeout.secs: 60
topology.debug: false
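The two topology-level settings above can also be applied per topology from code at submission time. The following is a minimal, hypothetical sketch (the class name SubmitExample and the topology name are illustrative, not from the book) showing how the same values map onto Storm's Java Config API; it reuses the SentenceSpout and SplitBolt from the earlier component sketch and assumes the cluster services described next are up and running.

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        // Wire up the same spout and bolt used in the earlier component sketch.
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentence-spout", new WordTopology.SentenceSpout(), 1);
        builder.setBolt("split-bolt", new WordTopology.SplitBolt(), 2)
               .shuffleGrouping("sentence-spout");

        Config conf = new Config();
        conf.setDebug(false);            // same effect as topology.debug: false
        conf.setMessageTimeoutSecs(60);  // same effect as topology.message.timeout.secs: 60
        conf.setNumWorkers(2);           // claim two of the ports listed under supervisor.slots.ports

        // Sends the topology to the nimbus node configured in storm.yaml.
        StormSubmitter.submitTopology("submit-example-topology", conf, builder.createTopology());
    }
}

Values set this way apply only to that topology and take precedence over the storm.yaml defaults shown above.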
Running

There are four services needed to start a complete Storm cluster.

Nimbus: First of all, we need to start the nimbus service. The following is the command to start it:

bin/storm nimbus

Supervisor: Next, we need to start the supervisor nodes so they connect with the nimbus node. The following is the command:

bin/storm supervisor

UI: To start the Storm UI, execute the following command:

bin/storm ui

You can access the UI at http://nimbus-host:8080.

Logviewer: The logviewer service lets you see the worker logs in the Storm UI. Execute the following command to start it:

bin/storm logviewer

Summary

We started with the history of Storm, where we discussed how Nathan Marz got the idea for Storm and the challenges he faced while releasing it as open source software and then moving it to Apache. We discussed the architecture of Storm and its components: nimbus, supervisors, workers, executors, and tasks form Storm's architecture, while its programming components are tuples, streams, topologies, spouts, and bolts. We also discussed how to set up Storm and configure it to run in a cluster, with Zookeeper set up first, as Storm requires it.

The above was an excerpt from the book Practical Real-time Data Processing and Analytics.
Read more

China’s Baidu launches Duer OS Prometheus Project to accelerate conversational AI

Abhishek Jha
10 Nov 2017
2 min read
Experts believe artificial intelligence is the operating system of the future. Earlier this year, when Baidu announced its DuerOS platform, it clearly threw its hat in the ring. Now the Chinese search giant has gone a step further, launching the DuerOS Prometheus Project, an initiative to accelerate conversational AI around that platform.

“Voice is increasingly becoming how we interact with our devices today,” Kaihua Zhu, chief technology officer of Baidu’s DuerOS, said in a statement. “Open datasets, interdisciplinary collaboration and financial incentives will create the conditions necessary for rapid advancement of conversational AI.”

DuerOS already provides conversational support for 10 major domains and over 100 subdomains in China. Since its beta launch at the beginning of 2017, it has quickly become the preferred choice for third-party hardware manufacturers in China seeking Mandarin-language voice recognition support for devices ranging from refrigerators and air conditioners to TV set-top boxes, storytelling machines, and smart speakers.

Under the project, Baidu will gradually open three large-scale datasets covering far-field wake word detection, far-field speech recognition, and multi-turn conversations, so that developers can train their algorithms for conversational AI systems. The wake word detection dataset will consist of around 500,000 voice clips of five to ten popular Chinese wake words, including “xiaodu xiaodu”, the wake word that activates DuerOS-enabled devices. The speech recognition datasets will include thousands of hours of Mandarin speech recognition data, enabling people to train systems that can accurately “hear” human speech under complex circumstances such as noisy environments. The project will also release thousands of dialogue samples covering 10 different domains to promote the development of multi-turn conversation technology.

To seek broader support for the operating system, Baidu has announced a $1 million fund to invest in efforts related to voice and machine learning. Guoguo Chen, Baidu’s Principal Architect for DuerOS, noted that in the age of AI, data should not be a barrier preventing smaller organizations and individuals from developing leading-edge conversational AI systems. “By opening our dataset and offering interdisciplinary collaborations and financial incentives, we hope to accelerate the pace of innovation in this space and advance the future of conversational computing,” Chen said.

The DuerOS Prometheus Project is sponsored by the Baidu Duer Business Unit, together with Baidu Speech Technology Group, Baidu Campus Branding, and Baidu Cloud.
Read more