Tech News - Artificial Intelligence

61 Articles

Paper in Two minutes: A novel method for resource efficient image classification

Sugandha Lahoti
23 Mar 2018
4 min read
This ICLR 2018 accepted paper, Multi-Scale Dense Networks for Resource Efficient Image Classification, introduces a new model to perform image classification with limited computational resources at test time. The paper is authored by Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, and Kilian Weinberger. The 6th annual ICLR conference is scheduled to take place from April 30 to May 3, 2018.

Using a multi-scale convolutional neural network for resource-efficient image classification

What problem is the paper attempting to solve?

Recent years have witnessed a surge in demand for applications of visual object recognition, for instance, in self-driving cars and content-based image search. This demand is driven by the astonishing progress of convolutional networks (CNNs), where state-of-the-art models may have even surpassed human-level performance. However, most of these are complex models with high computational demands at inference time. In real-world applications, computation is never free; it directly translates into power consumption, which should be minimized for environmental and economic reasons. Ideally, a system should automatically use small networks when test images are easy or computational resources are limited, and use big networks when test images are hard or computation is abundant.

To enable resource-efficient image recognition, the authors aim to develop CNNs that slice the computation and process these slices one by one, stopping the evaluation once the CPU time is depleted or the classification is sufficiently certain. Unfortunately, CNNs learn the data representation and the classifier jointly, which leads to two problems:

• The features in the last layer are extracted directly to be used by the classifier, whereas earlier features are not.
• The features in different layers of the network may have a different scale. Typically, the first layers of deep nets operate on a fine scale (to extract low-level features), whereas later layers transition to coarse scales that allow global context to enter the classifier.

The authors propose a novel network architecture that addresses both problems through careful design changes, allowing for resource-efficient image classification.

Paper summary

The model is based on a multi-scale convolutional neural network similar to the neural fabric, but with dense connections and with a classifier at each layer. This novel network architecture, called Multi-Scale DenseNet (MSDNet), addresses both of the problems described above (of classifiers altering the internal representation and the lack of coarse-scale features in early layers) for resource-efficient image classification. The network uses a cascade of intermediate classifiers throughout the network.

The first problem is addressed through the introduction of dense connectivity. By connecting all layers to all classifiers, features are no longer dominated by the most imminent early exit, and the trade-off between early or later classification can be performed elegantly as part of the loss function.

The second problem is addressed by adopting a multi-scale network structure. At each layer, features of all scales (fine to coarse) are produced, which facilitates good classification early on but also extracts low-level features that only become useful after several more layers of processing.

Key Takeaways

• MSDNet is a novel convolutional network architecture optimized to incorporate CPU budgets at test time.
• The design is based on two high-level design principles: generate and maintain coarse-level features throughout the network, and interconnect the layers with dense connectivity. The final network design is a two-dimensional array of horizontal and vertical layers, which decouples depth and feature coarseness. Whereas in traditional convolutional networks features only become coarser with increasing depth, MSDNet generates features of all resolutions from the first layer on and maintains them throughout.
• Through experiments, the authors show that their network outperforms all competitive baselines on an impressive range of budgets, from highly limited CPU constraints to almost unconstrained settings.

Reviewer feedback summary

Overall score: 25/30. Average score: 8.33.

The reviewers found the approach natural and effective, with good results. They found the presentation clear and easy to follow, and the structure of the network clearly justified. The reviewers found the use of dense connectivity to avoid the loss of performance from early-exit classifiers interesting. They appreciated the results and found them quite promising, with 5x speed-ups and the same or better accuracy than previous models. However, some reviewers pointed out that the results about the more efficient DenseNet* could have been shown in the main paper.
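To make the anytime/budgeted-prediction idea concrete, the following is a minimal Python sketch of early-exit inference: a sequence of progressively more expensive classifiers is evaluated until either the prediction is confident enough or the compute budget runs out. This is a schematic illustration, not the authors' MSDNet implementation; the classifier stages, costs, and confidence threshold are hypothetical placeholders.

    import numpy as np

    def softmax(logits):
        """Numerically stable softmax over a 1-D array of logits."""
        z = logits - logits.max()
        e = np.exp(z)
        return e / e.sum()

    def budgeted_predict(x, stages, budget, confidence_threshold=0.9):
        """Evaluate early-exit classifier stages in order of increasing cost.

        stages: list of (classifier_fn, cost) pairs; each classifier_fn maps the
                input to a vector of class logits (hypothetical placeholders).
        budget: total compute budget available for this example.
        Returns (predicted_class, spent_budget).
        """
        spent = 0.0
        probs = None
        for classifier_fn, cost in stages:
            if spent + cost > budget:          # next stage would exceed the budget
                break
            spent += cost
            probs = softmax(classifier_fn(x))  # intermediate classifier output
            if probs.max() >= confidence_threshold:
                break                          # confident enough: exit early
        if probs is None:
            raise ValueError("budget too small for the first classifier stage")
        return int(np.argmax(probs)), spent

    # Toy usage with two random "stages" standing in for a shallow and a deep exit.
    rng = np.random.default_rng(0)
    stages = [
        (lambda x: rng.normal(size=10), 1.0),  # cheap, early exit
        (lambda x: rng.normal(size=10), 4.0),  # expensive, later exit
    ]
    print(budgeted_predict(x=None, stages=stages, budget=5.0))

The design choice mirrored here is the same one the paper motivates: easy inputs exit at a cheap classifier, while hard inputs consume more of the budget before a prediction is made.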


Amazon is supporting research into conversational AI with Alexa fellowships

Sugandha Lahoti
03 Sep 2018
3 min read
Amazon has chosen recipients from all over the world to be awarded the Alexa fellowships. The Alexa Fellowships program is open to PhD and post-doctoral students specializing in conversational AI at select universities. The program was launched last year, when four researchers won awards.

Amazon's Alexa Graduate Fellowship

The Alexa Graduate Fellowship supports conversational AI research by providing funds and mentorship to PhD and postdoctoral students. Faculty advisors and Alexa Graduate Fellows will also teach conversational AI to undergraduate and graduate students using the Alexa Skills Kit (ASK) and Alexa Voice Services (AVS). The graduate fellowship recipients are selected based on their research interests, planned coursework and existing conversational AI curriculum. This year the institutions include six in the United States, two in the United Kingdom, one in Canada and one in India. The 10 universities are:

• Carnegie Mellon University, Pittsburgh, PA
• International Institute of Information Technology, Hyderabad, India
• Johns Hopkins University, Baltimore, MD
• MIT App Inventor, Boston, MA
• University of Cambridge, Cambridge, United Kingdom
• University of Sheffield, Sheffield, United Kingdom
• University of Southern California, Los Angeles, CA
• University of Texas at Austin, Austin, TX
• University of Washington, Seattle, WA
• University of Waterloo, Waterloo, Ontario, Canada

Amazon's Alexa Innovation Fellowship

The Alexa Innovation Fellowship is dedicated to innovations in conversational AI. The program was introduced this year, and Amazon has partnered with university entrepreneurship centers to help student-led startups build their innovative conversational interfaces. The fellowship also provides resources to faculty members. This year, ten leading entrepreneurship center faculty members were selected as the inaugural class of Alexa Innovation Fellows. They are invited to learn from the Alexa team and network with successful Alexa Fund entrepreneurs. Instructors will receive funding, Alexa devices, hardware kits and regular training, as well as introductions to successful Alexa Fund-backed entrepreneurs. The 10 universities selected to receive the 2018-2019 Alexa Innovation Fellowship are:

• Arizona State University, Tempe, AZ
• California State University, Northridge, CA
• Carnegie Mellon University, Pittsburgh, PA
• Dartmouth College, Hanover, NH
• Emerson College, Boston, MA
• Texas A&M University, College Station, TX
• University of California, Berkeley, CA
• University of Illinois, Urbana-Champaign, IL
• University of Michigan, Ann Arbor, MI
• University of Southern California, Los Angeles, CA

"We want to make it easier and more accessible for smart people outside of the company to get involved with conversational AI. That's why we launched the Alexa Skills Kit (ASK) and Alexa Voice Services (AVS) and allocated $200 million to promising startups innovating with voice via the Alexa Fund," wrote Kevin Crews, Senior Product Manager for the Amazon Alexa Fellowship, in a blog post.

Read more about the 2018-2019 Alexa Fellowship class on the Amazon blog.

Read next:

• Cortana and Alexa become best friends: Microsoft and Amazon release a preview of this integration
• Voice, natural language, and conversations: Are they the next web UI?


Lyft releases an autonomous driving dataset “Level 5” and sponsors research competition

Amrata Joshi
25 Jul 2019
3 min read
This week, the team at Lyft released a subset of their autonomous driving data, the Level 5 Dataset, and will be sponsoring a research competition. The Level 5 Dataset includes over 55,000 human-labelled 3D annotated frames, a drivable surface map, as well as an HD spatial semantic map for contextualizing the data.

The team has been perfecting their hardware and autonomy stack for the last two years. The sensor hardware needs to be built and properly calibrated, a localization stack is required, and an HD semantic map must be created; only then is it possible to unlock higher-level functionality like 3D perception, prediction, and planning. The dataset allows a broad cross-section of researchers to contribute to downstream research in self-driving technology. The team is iterating on the third generation of Lyft's self-driving car and has already patented a new sensor array and a proprietary ultra-high dynamic range (100+ dB) camera.

Since HD mapping is crucial to autonomous vehicles, the teams in Munich and Palo Alto have been working towards building high-quality lidar-based geometric maps and high-definition semantic maps that are used by the autonomy stack. The team is also working towards building high-quality and cost-effective geometric maps that would use only a camera phone for capturing the source data.

Lyft's autonomous platform team has been deploying partner vehicles on the Lyft network. Along with their partner Aptiv, the team has successfully provided over 50,000 self-driving rides to Lyft passengers in Las Vegas, making it the largest paid commercial self-driving service in operation. Waymo vehicles are also now available on the Lyft network in Arizona, which expands the opportunity for passengers to experience self-driving rides.

To advance self-driving vehicles, the team will also be launching a competition for individuals to train algorithms on the dataset. The dataset makes it possible for researchers to work on problems such as prediction of agents over time, scene depth estimation from cameras with lidar as ground truth, and many more. The blog post reads, "We have segmented this dataset into training, validation, and testing sets — we will release the validation and testing sets once the competition opens." It further reads, "There will be $25,000 in prizes, and we'll be flying the top researchers to the NeurIPS Conference in December, as well as allowing the winners to interview with our team. Stay tuned for specific details of the competition!"

To know more about this news, check out the Medium post.

Lyft announces Envoy Mobile, an iOS and Android client network library for mobile application networking
Uber and Lyft drivers go on strike a day before Uber IPO roll-out
Lyft introduces Amundsen, a data discovery and metadata engine for its researchers and data scientists


4 Clustering Algorithms every Data Scientist should know

Sugandha Lahoti
07 Nov 2017
6 min read
This is an excerpt from the book Java Data Analysis by John R. Hubbard. In this article, we look at four popular clustering algorithms: hierarchical clustering, k-means clustering, k-medoids clustering, and the affinity propagation algorithm, along with their pseudocode.

A clustering algorithm is one that identifies groups of data points according to their proximity to each other. These algorithms are similar to classification algorithms in that they also partition a dataset into subsets of similar points. But in classification, we already have data whose classes have been identified, such as sweet fruit. In clustering, we seek to discover the unknown groups themselves.

Hierarchical clustering

Of the several clustering algorithms that we will examine in this article, hierarchical clustering is probably the simplest. The trade-off is that it works well only with small datasets in Euclidean space. The general setup is that we have a dataset S of m points in R^n which we want to partition into a given number k of clusters C1, C2, ..., Ck, where within each cluster the points are relatively close together. Here is the algorithm:

1. Create a singleton cluster for each of the m data points.
2. Repeat m – k times:
   • Find the two clusters whose centroids are closest.
   • Replace those two clusters with a new cluster that contains their points.

The centroid of a cluster is the point whose coordinates are the averages of the corresponding coordinates of the cluster points. For example, the centroid of the cluster C = {(2, 4), (3, 5), (6, 6), (9, 1)} is the point (5, 4), because (2 + 3 + 6 + 9)/4 = 5 and (4 + 5 + 6 + 1)/4 = 4. This is illustrated in the figure below.

K-means clustering

A popular alternative to hierarchical clustering is the K-means algorithm. It is related to the K-Nearest Neighbor (KNN) classification algorithm. As with hierarchical clustering, the K-means clustering algorithm requires the number of clusters, k, as input. (This version is also called the K-Means++ algorithm.) Here is the algorithm:

1. Select k points from the dataset.
2. Create k clusters, each with one of the initial points as its centroid.
3. For each dataset point x that is not already a centroid:
   • Find the centroid y that is closest to x.
   • Add x to that centroid's cluster.
   • Re-compute the centroid for that cluster.

It also requires k points, one for each cluster, to initialize the algorithm. These initial points can be selected at random, or by some a priori method. One approach is to run hierarchical clustering on a small sample taken from the given dataset and then pick the centroids of those resulting clusters.

K-medoids clustering

The k-medoids clustering algorithm is similar to the k-means algorithm, except that each cluster center, called its medoid, is one of the data points instead of being the mean of its points. The idea is to minimize the average distances from the medoids to points in their clusters. The Manhattan metric is usually used for these distances. Since those averages will be minimal if and only if the distances are, the algorithm is reduced to minimizing the sum of all distances from the points to their medoids. This sum is called the cost of the configuration. Here is the algorithm:

1. Select k points from the dataset to be medoids.
2. Assign each data point to its closest medoid. This defines the k clusters.
3. For each cluster Cj:
   • Compute the sum s = Σj sj, where each sj = Σ{ d(x, yj) : x ∈ Cj }, and change the medoid yj to whatever point in the cluster Cj minimizes s.
   • If the medoid yj was changed, re-assign each x to the cluster whose medoid is closest.
4. Repeat step 3 until s is minimal.

This is illustrated by the simple example in Figure 8.16 of the book (not reproduced here). It shows 10 data points in 2 clusters, with the two medoids shown as filled points. The initial configuration is:

C1 = {(1,1), (2,1), (3,2), (4,2), (2,3)}, with y1 = x1 = (1,1)
C2 = {(4,3), (5,3), (2,4), (4,4), (3,5)}, with y2 = x10 = (3,5)

The sums are:

s1 = d(x2,y1) + d(x3,y1) + d(x4,y1) + d(x5,y1) = 1 + 3 + 4 + 3 = 11
s2 = d(x6,y2) + d(x7,y2) + d(x8,y2) + d(x9,y2) = 3 + 4 + 2 + 2 = 11
s = s1 + s2 = 11 + 11 = 22

In the first part of step 3, the algorithm changes the medoid for C1 to y1 = x3 = (3,2). In the second part of step 3, this causes the clusters to change to:

C1 = {(1,1), (2,1), (3,2), (4,2), (2,3), (4,3), (5,3)}, with y1 = x3 = (3,2)
C2 = {(2,4), (4,4), (3,5)}, with y2 = x10 = (3,5)

This makes the sums:

s1 = 3 + 2 + 1 + 2 + 2 + 3 = 13
s2 = 2 + 2 = 4
s = s1 + s2 = 13 + 4 = 17

The resulting configuration is shown in the second panel of the figure. At step 3 of the algorithm, the process repeats for cluster C2. The resulting configuration is shown in the third panel of the figure. The computations are:

C1 = {(1,1), (2,1), (3,2), (4,2), (4,3), (5,3)}, with y1 = x3 = (3,2)
C2 = {(2,3), (2,4), (4,4), (3,5)}, with y2 = x8 = (2,4)
s = s1 + s2 = (3 + 2 + 1 + 2 + 3) + (1 + 2 + 2) = 11 + 5 = 16

The algorithm continues with two more changes, finally converging to the minimal configuration shown in the fifth panel of the figure. This version of k-medoid clustering is also called partitioning around medoids (PAM).

Affinity propagation clustering

One disadvantage of each of the clustering algorithms previously presented (hierarchical, k-means, k-medoids) is the requirement that the number of clusters k be determined in advance. The affinity propagation clustering algorithm does not have that requirement. Developed in 2007 by Brendan J. Frey and Delbert Dueck at the University of Toronto, it has become one of the most widely used clustering methods.

Like k-medoid clustering, affinity propagation selects cluster center points, called exemplars, from the dataset to represent the clusters. This is done by message-passing between the data points. The algorithm works with three two-dimensional arrays:

• sij = the similarity between xi and xj
• rik = responsibility: a message from xi to xk on how well-suited xk is as an exemplar for xi
• aik = availability: a message from xk to xi on how well-suited xk is as an exemplar for xi

Here is the complete algorithm:

1. Initialize the similarities: sij = –d(xi, xj)^2 for i ≠ j; sii = the average of those other sij values.
2. Repeat until convergence:
   • Update the responsibilities: rik = sik – max{ aij + sij : j ≠ k }
   • Update the availabilities: aik = min{ 0, rkk + Σj { max{0, rjk} : j ≠ i ∧ j ≠ k } } for i ≠ k; akk = Σj { max{0, rjk} : j ≠ k }

A point xk will be an exemplar for a point xi if aik + rik = maxj { aij + rij }.

If you enjoyed this excerpt from the book Java Data Analysis by John R. Hubbard, check out the book to learn how to implement various machine learning algorithms, data visualization, and more in Java.
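As a concrete companion to the K-means description above, here is a minimal NumPy sketch of the single-pass variant outlined in the excerpt: select k seed points, then assign each remaining point to its nearest centroid and re-compute that centroid. It is an illustrative sketch rather than the book's Java implementation, and the toy data and random seed are arbitrary.

    import numpy as np

    def single_pass_kmeans(points, k, seed=None):
        """Single-pass k-means in the spirit of the excerpt above:
        pick k initial points as centroids, then stream the remaining points,
        assigning each to the nearest centroid and re-computing that centroid."""
        rng = np.random.default_rng(seed)
        points = np.asarray(points, dtype=float)
        idx = rng.choice(len(points), size=k, replace=False)  # initial centroids
        centroids = points[idx].copy()
        clusters = [[c.copy()] for c in centroids]
        for i, x in enumerate(points):
            if i in idx:                                       # skip the seed points
                continue
            j = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))  # nearest centroid
            clusters[j].append(x)
            centroids[j] = np.mean(clusters[j], axis=0)        # re-compute that centroid
        return centroids, clusters

    data = [(2, 4), (3, 5), (6, 6), (9, 1), (1, 1), (2, 1)]
    centroids, clusters = single_pass_kmeans(data, k=2, seed=0)
    print(centroids)

The standard iterative (Lloyd's) version repeats the assignment and update steps over the whole dataset until the centroids stop moving; the single-pass form above matches the step-by-step description given in the excerpt.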


Data Governance in Operations Needed to Ensure Clean Data for AI Projects from AI Trends

Matthew Emerick
15 Oct 2020
5 min read
By AI Trends Staff

Data governance in data-driven organizations is a set of practices and guidelines that define where responsibility for data quality lives. The guidelines support the operation's business model, especially if AI and machine learning applications are at work.

Data governance is an operations issue, existing between strategy and the daily management of operations, suggests a recent account in the MIT Sloan Management Review. "Data governance should be a bridge that translates a strategic vision acknowledging the importance of data for the organization and codifying it into practices and guidelines that support operations, ensuring that products and services are delivered to customers," stated author Gregory Vial, an assistant professor of IT at HEC Montréal.

To prevent data governance from being limited to a plan that nobody reads, "governing" data needs to be a verb and not a noun phrase as in "data governance." Vial writes, "The difference is subtle but ties back to placing governance between strategy and operations — because these activities bridge and evolve in step with both."

An overall framework for data governance was proposed by Vijay Khatri and Carol V. Brown in a piece in Communications of the ACM published in 2010. The two suggested the strategy is based on five dimensions that represent a combination of structural, operational and relational mechanisms. The five dimensions are:

• Principles at the foundation of the framework that relate to the role of data as an asset for the organization;
• Quality to define the requirements for data to be usable and the mechanisms in place to assess that those requirements are met;
• Metadata to define the semantics crucial for interpreting and using data — for example, those found in a data catalog that data scientists use to work with large data sets hosted on a data lake;
• Accessibility to establish the requirements related to gaining access to data, including security requirements and risk mitigation procedures;
• Life cycle to support the production, retention, and disposal of data on the basis of organization and/or legal requirements.

"Governing data is not easy, but it is well worth the effort," stated Vial. "Not only does it help an organization keep up with the changing legal and ethical landscape of data production and use; it also helps safeguard a precious strategic asset while supporting digital innovation."

Master Data Management Seen as a Path to Clean Data Governance

Once the organization commits to data quality, what's the best way to get there? Naturally, entrepreneurs are in a position to step forward with suggestions. Some of them are around master data management (MDM), a discipline where business and IT work together to ensure the accuracy and consistency of the enterprise's master data assets.

Organizations starting down the path with AI and machine learning may be tempted to clean only the data that feeds a specific application project, a costly approach in the long run, suggests one expert. "A better, more sustainable way is to continuously cure the data quality issues by using a capable data management technology. This will result in your training data sets becoming rationalized production data with the same master data foundation," suggests Bill O'Kane, author of a recent account from tdwi.org on master data management. Formerly an analyst with Gartner, O'Kane is now the VP and MDM strategist at Profisee, a firm offering an MDM solution.

If the data feeding into the AI system is not unique, accurate, consistent and timely, the models will not produce reliable results and are likely to lead to unwanted business outcomes. These could include different decisions being made on two customer records thought to represent different people but that in fact describe the same person, or recommending a product to a customer that was previously returned or generated a complaint.

Perceptilabs Tries to Get in the Head of the Machine Learning Scientist

Getting inside the head of a machine learning scientist might be helpful in understanding how a highly trained expert builds and trains complex mathematical models. "This is a complex, time-consuming process, involving thousands of lines of code," writes Martin Isaksson, co-founder and CEO of Perceptilabs, in a recent account in VentureBeat. Perceptilabs offers a product to help automate the building of machine learning models, what it calls a "GUI for TensorFlow."

"As AI and ML took hold and the experience levels of AI practitioners diversified, efforts to democratize ML materialized into a rich set of open source frameworks like TensorFlow and datasets. Advanced knowledge is still required for many of these offerings, and experts are still relied upon to code end-to-end ML solutions," Isaksson wrote.

AutoML tools have emerged to help adjust parameters and train machine learning models so that they are deployable. Perceptilabs is adding a visual modeler to the mix. The company designed its tool as a visual API on top of TensorFlow, which it acknowledges as the most popular ML framework. The approach gives developers access to the low-level TensorFlow API and the ability to pull in other Python modules. It also gives users transparency into how the model is architected and a view into how it performs.

Read the source articles in the MIT Sloan Management Review, Communications of the ACM, tdwi.org and VentureBeat.
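Returning to the point above that training data must be unique, accurate, consistent and timely: the following is a minimal sketch, assuming pandas and a customer table with hypothetical column names ('customer_id', 'email', 'updated_at'), of the kind of basic data-quality checks an MDM-style pipeline might run before model training. It is illustrative only and not any vendor's actual product.

    import pandas as pd

    def master_data_quality_report(df: pd.DataFrame) -> dict:
        """Basic uniqueness / completeness / freshness checks on customer records."""
        return {
            # Uniqueness: the same person should not appear under two records.
            "duplicate_ids": int(df["customer_id"].duplicated().sum()),
            "duplicate_emails": int(df["email"].str.lower().duplicated().sum()),
            # Completeness: missing values per column.
            "missing_values": df.isna().sum().to_dict(),
            # Freshness: records not updated in the last 365 days.
            "stale_records": int(
                (pd.Timestamp.now() - pd.to_datetime(df["updated_at"]))
                .dt.days.gt(365).sum()
            ),
        }

    df = pd.DataFrame({
        "customer_id": [1, 2, 2, 3],
        "email": ["a@x.com", "b@x.com", "B@x.com", None],
        "updated_at": ["2020-01-01", "2019-05-01", "2016-07-01", "2020-09-15"],
    })
    print(master_data_quality_report(df))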


Dr. Brandon explains Word Vectors (word2vec) to Jon

Aarthi Kumaraswamy
01 Nov 2017
6 min read
Dr. Brandon: Welcome back to the second episode of 'Date with Data Science'. Last time, we explored natural language processing. Today we talk about one of the most used approaches in NLP: word vectors.

Jon: Hold on Brandon, when we went over maths 101, didn't you say numbers become vectors when they have a weight and direction attached to them? But numbers and words are apples and oranges! I don't understand how words could also become vectors. Unless the words are coming from my movie director and he is yelling at me :) ... What would the point of words having directions be, anyway?

Dr. Brandon: Excellent question to kick off today's topic, Jon. On an unrelated note, I am sure your director has his reasons. The following is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava.

Traditional NLP approaches rely on converting individual words--which we created via tokenization--into a format that a computer algorithm can learn (that is, predicting the movie sentiment). Doing this required us to convert a single review of N tokens into a fixed representation by creating a TF-IDF matrix. In doing so, we did two important things behind the scenes:

• Individual words were assigned an integer ID (for example, a hash). For example, the word friend might be assigned to 39,584, while the word bestie might be assigned to 99,928,472. Cognitively, we know that friend is very similar to bestie; however, any notion of similarity is lost by converting these tokens into integer IDs.
• By converting each token into an integer ID, we consequently lose the context with which the token was used. This is important because, in order to understand the cognitive meaning of words, and thereby train a computer to learn that friend and bestie are similar, we need to understand how the two tokens are used (for example, their respective contexts).

Given this limited functionality of traditional NLP techniques with respect to encoding the semantic and syntactic meaning of words, Tomas Mikolov and other researchers explored methods that employ neural networks to better encode the meaning of words as a vector of N numbers (for example, vector bestie = [0.574, 0.821, 0.756, ..., 0.156]). When calculated properly, we will discover that the vectors for bestie and friend are close in space, whereby closeness is defined as a cosine similarity. It turns out that these vector representations (often referred to as word embeddings) give us the ability to capture a richer understanding of text.

Interestingly, using word embeddings also gives us the ability to learn the same semantics across multiple languages despite differences in the written form (for example, Japanese and English). For example, the Japanese word for movie is eiga; therefore, it follows that, using word vectors, these two words should be close in the vector space despite their differences in appearance. Thus, word embeddings allow applications to be language-agnostic--yet another reason why this technology is hugely popular!

Word2vec explained

First things first: word2vec does not represent a single algorithm but rather a family of algorithms that attempt to encode the semantic and syntactic meaning of words as a vector of N numbers (hence, word-to-vector = word2vec). We will explore each of these algorithms in depth in this chapter, while also giving you the opportunity to read/research other areas of vectorization of text, which you may find helpful.

What is a word vector?

In its simplest form, a word vector is merely a one-hot encoding, whereby every element in the vector represents a word in our vocabulary, and the given word is encoded with 1 while all the other word elements are encoded with 0. Suppose our vocabulary only has the following movie terms: Popcorn, Candy, Soda, Tickets, and Blockbuster. Following the logic we just explained, we could encode the term Tickets as [0, 0, 0, 1, 0].

Using this simplistic form of encoding, which is what we do when we create a bag-of-words matrix, there is no meaningful comparison we can make between words (for example, is Popcorn related to Soda; is Candy similar to Tickets?). Given these obvious limitations, word2vec attempts to remedy this via distributed representations for words. Suppose that for each word we have a distributed vector of, say, 300 numbers, whereby each word in our vocabulary is represented by a distribution of weights across those 300 elements. Given this distributed representation of individual words as 300 numeric values, we can make meaningful comparisons among words using a cosine similarity, for example. That is, using the vectors for Tickets and Soda, we can determine that the two terms are not related, given their vector representations and their cosine similarity to one another.

And that's not all we can do! In their ground-breaking paper, Mikolov et al. also performed mathematical operations on word vectors to make some incredible findings; in particular, the authors give the following math problem to their word2vec dictionary:

V(King) - V(Man) + V(Woman) ~ V(Queen)

It turns out that these distributed vector representations of words are extremely powerful for comparison questions (for example, is A related to B?), which is all the more remarkable when you consider that this semantic and syntactic learned knowledge comes from observing lots of words and their context with no other information necessary. That is, we did not have to tell our machine that Popcorn is a food, noun, singular, and so on. How is this made possible? Word2vec employs the power of neural networks in a supervised fashion to learn the vector representation of words (which is an unsupervised task).

The above is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava. To learn more about the word2vec and doc2vec algorithms, such as continuous bag-of-words (CBOW), skip-gram, and distributed memory, among other models, and to build applications based on these, check out the book.
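As a quick, hedged illustration of word vectors in practice (not taken from the book, which uses Spark), here is a minimal sketch using the gensim library, assuming gensim 4.x. The toy corpus is far too small to learn meaningful embeddings and is shown only to demonstrate the API.

    from gensim.models import Word2Vec

    # Toy corpus: in practice you would use millions of tokenized sentences.
    sentences = [
        ["popcorn", "candy", "soda", "tickets", "blockbuster"],
        ["my", "best", "friend", "is", "my", "bestie"],
        ["the", "movie", "was", "great"],
    ]

    # Train a small skip-gram model (sg=1); vector_size is the embedding dimension.
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)

    # Each word is now a dense vector...
    print(model.wv["friend"][:5])

    # ...and closeness between words is measured with cosine similarity.
    print(model.wv.similarity("friend", "bestie"))

    # The famous analogy query V(King) - V(Man) + V(Woman) ~ V(Queen) would be:
    # model.wv.most_similar(positive=["king", "woman"], negative=["man"])
    # (commented out here because those words are not in the toy corpus).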

AI chipmaking startup ‘Graphcore’ raises $200m from BMW, Microsoft, Bosch, Dell

Melisha Dsouza
18 Dec 2018
2 min read
Today, Graphcore, a UK-based chipmaking startup, has raised $200m in a Series D funding round from investors including Microsoft and BMW, valuing the company at $1.7bn. This new funding brings the total capital raised by Graphcore to date to more than $300m.

The funding round was led by the U.K. venture capital firm Atomico and Sofina, with participation from some of the biggest names in the AI and machine learning industry, including Merian Global Investors, BMW i Ventures, Microsoft, Amadeus Capital Partners, Robert Bosch Venture Capital and Dell Technologies Capital, amongst many others. The company intends to use the funds generated to execute on its product roadmap, accelerate scaling and expand its global presence.

Graphcore, which designs chips purpose-built for artificial intelligence, is attempting to create a new class of chips that are better able to deal with the huge amounts of data needed to make AI computers. The company is ramping up production to meet customer demand for its Intelligence Processing Unit (IPU) PCIe processor cards, the first to be designed specifically for machine intelligence training and inference.

Nigel Toon, CEO and co-founder of Graphcore, said that Graphcore's processing units can be used for both the training and deployment of machine learning systems, and that they are "much more efficient". Tobias Jahn, principal at BMW i Ventures, stated that Graphcore's technology "is well-suited for a wide variety of applications from intelligent voice assistants to self-driving vehicles."

Last year the company raised $50 million from investors including Demis Hassabis, co-founder of DeepMind; Zoubin Ghahramani of Cambridge University and chief scientist at Uber; Pieter Abbeel from UC Berkeley; and Greg Brockman, Scott Gray and Ilya Sutskever from OpenAI.

Head over to Graphcore's official blog for more insights on this news.

Microsoft Azure reportedly chooses Xilinx chips over Intel Altera for AI co-processors, says Bloomberg report
NVIDIA makes its new "brain for autonomous AI machines", Jetson AGX Xavier Module, available for purchase
NVIDIA demos a style-based generative adversarial network that can generate extremely realistic images; has ML community enthralled


Dr. Brandon explains 'Transfer Learning' to Jon

Shoaib Dabir
15 Nov 2017
5 min read
Dr. Brandon: Hello and welcome to another episode of 'Date with Data Science'. Today we are going to talk about a topic that is all the rage these days in the data science community: transfer learning.

Jon: 'Transfer learning' sounds all sci-fi to me. Is it like the thing that Prof. X does in X-Men, reading other people's minds using that dome-like headset thing in his chamber?

Dr. Brandon: If we are going to get X-Men involved, what Prof. X does is closer to deep learning. We will talk about that another time. Transfer learning is simpler to explain. It's what you actually do every time you get into some character, Jon. Say you are given the role of Jack Sparrow to play. You will probably read a lot about pirates, watch a lot of pirate movies and even Johnny Depp in character, and form your own version of Jack Sparrow. Now, after that acting assignment is over, say you are given the opportunity to audition for the role of Captain Hook, the famous pirate from Peter Pan. You won't do your research from ground zero this time. You will retain the general mannerisms of a pirate you learned from your previous role, but will only learn the nuances of Captain Hook, like acting one-handed.

Jon: That's pretty cool! So you say machines can also learn this way?

Dr. Brandon: Of course, that's what transfer learning is all about: learn something, abstract the learning sufficiently, then apply it to another related problem. The following is an excerpt from a book by Kuntal Ganguly titled Learning Generative Adversarial Networks.

Pre-trained models are not optimized for tackling user-specific datasets, but they are extremely useful when the task at hand is similar to the task the model was trained for. For example, a popular model, InceptionV3, is optimized for classifying images across a broad set of 1000 categories, but our domain might be to classify some dog breeds. A well-known technique used in deep learning that adapts an existing trained model to a similar task is known as transfer learning. And this is why transfer learning has gained a lot of popularity among deep learning practitioners and in recent years has become the go-to technique in many real-life use cases. It is all about transferring knowledge (or features) among related domains.

Purpose of Transfer Learning

Let's say you have trained a deep neural network to differentiate between fresh mangoes and rotten mangoes. During training, the network requires thousands of rotten and fresh mango images and hours of training to learn knowledge such as: if a fruit is rotten, a liquid will ooze out of it and it will produce a bad odor. With this training experience, the network can then be used for a different task/use case: differentiating between rotten apples and fresh apples, using the knowledge of rot-related features learned during training on mango images.

The general approach of transfer learning is to train a base network and then copy its first n layers to the first n layers of a target network. The remaining layers of the target network are initialized randomly and trained toward the targeted use case. The main scenarios for using transfer learning in your deep learning workflow are as follows (a minimal code sketch follows the excerpt):

• Smaller datasets: When you have a smaller dataset, building a deep learning model from scratch won't work well. Transfer learning provides a way to apply a pre-trained model to new classes of data. For example, a pre-trained model built from one million ImageNet images will converge to a decent solution after training on just a fraction of a smaller training dataset (for example, CIFAR-10), compared to a deep learning model built from scratch on that smaller dataset.
• Less resources: Deep learning processes (such as convolution) require a significant amount of resources and time, and are well suited to running on high-end GPU-based machines. But with pre-trained models, you can easily train across a full training set (let's say 50,000 images) in less than a minute using a laptop/notebook without a GPU, since the majority of the time only the final layer of the model is modified, with a simple update of just a classifier or regressor.

Various approaches to using pre-trained models

• Using the pre-trained architecture: Instead of transferring the weights of the trained model, we can use only the architecture and initialize our own random weights on our new dataset.
• Feature extractor: A pre-trained model can be used as a feature extraction mechanism simply by removing the output layer of the network (the one that gives the probabilities for being in each of the n classes) and then freezing all the previous layers of the network as a fixed feature extractor for the new dataset.
• Partially freezing the network: Instead of replacing only the final layer and extracting features from all previous layers, sometimes we might train our new model partially (that is, keep the weights of the initial layers of the network frozen while retraining only the higher layers). The choice of the number of frozen layers can be considered one more hyperparameter.

Next, read about how transfer learning is being used in the real world. If you enjoyed the above excerpt, do check out the book it is from.
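To make the feature-extractor and partial-freezing ideas concrete, here is a minimal sketch using TensorFlow/Keras (an assumption; the book's own examples may use a different framework). It loads InceptionV3 pre-trained on ImageNet, freezes its layers, and adds a small trainable classification head for a hypothetical dog-breed task; the number of classes and the hyperparameters are placeholders.

    import tensorflow as tf

    NUM_CLASSES = 10  # hypothetical number of dog breeds

    # Base network: InceptionV3 trained on ImageNet, without its final classifier.
    base = tf.keras.applications.InceptionV3(
        weights="imagenet", include_top=False, input_shape=(299, 299, 3)
    )
    base.trainable = False  # freeze all pre-trained layers (feature extractor)

    # New, randomly initialized head trained on the target dataset.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # model.fit(train_images, train_labels, epochs=5)  # hypothetical target data

    # Optional second stage ("partially freezing the network"): unfreeze only the
    # top layers of the base model and re-compile with a lower learning rate.
    # base.trainable = True
    # for layer in base.layers[:-30]:
    #     layer.trainable = False

Freezing the base keeps training cheap, because only the small head is updated; unfreezing a few top layers afterwards lets the model adapt higher-level features to the new domain.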


Amazon tried to sell its facial recognition technology to ICE in June, emails reveal

Richard Gall
24 Oct 2018
3 min read
It has emerged that Amazon representatives met with Immigration and Customs Enforcement (ICE) this summer in a bid to sell its facial recognition tool Rekognition. Emails obtained by The Daily Beast show that officials from Amazon met with ICE on June 12 in Redwood City. In that meeting, Amazon outlined some of AWS's capabilities, stating that "we are ready and willing to help support the vital HSI [Homeland Security Investigations] mission."

The emails (which you can see for yourself here) also show that Amazon was keen to set up a "workshop" with U.S. Homeland Security, and "a meeting to review the process in more depth and help assess your target list of 'Challenges [capitalization intended]'." What exactly these 'Challenges' refer to is unclear.

The controversy around Amazon's Rekognition tool

These emails will only serve to increase the controversy around Rekognition and Amazon's broader involvement with security services. Earlier this year, the ACLU (American Civil Liberties Union) revealed that a small number of law enforcement agencies were using Rekognition for various purposes. Later, in July, the ACLU published the results of its own experiment with Rekognition, in which the tool incorrectly matched 28 members of Congress with mugshots.

Amazon responded to this research with a rebuttal on the AWS blog. In it, Dr. Matt Wood stated that "machine learning is a very valuable tool to help law enforcement agencies, and while being concerned it's applied correctly, we should not throw away the oven because the temperature could be set wrong and burn the pizza." This post was referenced in the email correspondence between Amazon and ICE. Clearly, accuracy was an issue in the company's discussions with security officials.

The controversy continued this month after an employee published an anonymous letter on Medium, urging the company not to sell Rekognition to police. They wrote: "When a company puts new technologies into the world, it has a responsibility to think about the consequences."

Amazon claims Rekognition isn't a surveillance service

We covered this story on the Packt Hub last week. Following publication, an Amazon PR representative contacted us, stating that "Amazon Rekognition is NOT a surveillance service" [emphasis the writer's, not mine]. The representative also cited the post by Dr. Matt Wood mentioned above, keen to tackle some of the challenges presented by the ACLU research.

Although Amazon's position is clear, it will be difficult for the organization to maintain that line given these emails. Separating the technology from its deployment is all well and good until it's clear that you're courting the kind of deployment for which you are being criticised.

Note 10.30.2018: An Amazon spokesperson responded with a comment, wishing to clarify the events described from its perspective: "We participated with a number of other technology companies in technology 'boot camps' sponsored by McKinsey Company, where a number of technologies were discussed, including Rekognition. As we usually do, we followed up with customers who were interested in learning more about how to use our services (Immigration and Customs Enforcement was one of those organizations where there was follow-up discussion)."


Automobile Repair Self-Diagnosis and Traffic Light Management Enabled by AI from AI Trends

Matthew Emerick
15 Oct 2020
5 min read
By AI Trends Staff

Looking inside and outside, AI is being applied to the self-diagnosis of automobiles and to the connection of vehicles to traffic infrastructure.

A data scientist at BMW Group in Munich, while working on his PhD, created a system for self-diagnosis called the Automated Damage Assessment Service, according to an account in Mirage. Milan Koch was completing his studies at the Leiden Institute of Advanced Computer Science in the Netherlands when he got the idea. "It should be a nice experience for customers," he stated.

The system gathers data over time from sensors in different parts of the car. "From scratch, we have developed a service idea that is about detecting damaged parts from low speed accidents," Koch stated. "The car itself is able to detect the parts that are broken and can estimate the costs and the time of the repair."

Koch developed and compared different multivariate time series methods, based on machine learning, deep learning and also state-of-the-art automated machine learning (AutoML) models. He tested different levels of complexity to find the best way to solve the time series problems. Two of the AutoML methods and his hand-crafted machine learning pipeline showed the best results. The system may have applications to other multivariate time series problems, where multiple time-dependent variables must be considered, outside the automotive field.

Koch collaborated with researchers from the Leiden University Medical Center (LUMC) to use his hand-crafted pipeline to analyze electroencephalography (EEG) data. Koch stated, "We predicted the cognition of patients based on EEG data, because an accurate assessment of cognitive function is required during the screening process for Deep Brain Stimulation (DBS) surgery. Patients with advanced cognitive deterioration are considered suboptimal candidates for DBS, as cognitive function may deteriorate after surgery. However, cognitive function is sometimes difficult to assess accurately, and analysis of EEG patterns may provide additional biomarkers. Our machine learning pipeline was well suited to apply to this problem."

He added, "We developed algorithms for the automotive domain and initially we didn't have the intention to apply it to the medical domain, but it worked out really well." His models are now also applied to electromyography (EMG) data, to distinguish between people with a motor disease and healthy people. Koch intends to continue his work at BMW Group, where he will focus on customer-oriented services, predictive maintenance applications and optimization of vehicle diagnostics.

DOE Grant to Research Traffic Management Delays Aims to Reduce Emissions

Getting automobiles to talk to the traffic management infrastructure is the goal of research at the University of Tennessee at Chattanooga, which has been awarded $1.89 million from the US Department of Energy to create a new model for traffic intersections that would reduce energy consumption.

The UTC Center for Urban Informatics and Progress (CUIP) will leverage its existing "smart corridor" to accommodate the new research. The smart corridor is a 1.25-mile span on a main artery in downtown Chattanooga, used as a test bed for research into smart city development and connected vehicles in a real-world environment. "This project is a huge opportunity for us," stated Dr. Mina Sartipi, CUIP Director and principal investigator, in a press release. "Collaborating on a project that is future-oriented, novel, and full of potential is exciting. This work will contribute to the existing body of literature and lead the way for future research." UTC is collaborating with the University of Pittsburgh, the Georgia Institute of Technology, the Oak Ridge National Laboratory, and the City of Chattanooga on the project.

In the grant proposal for the DOE, the research team noted that the US transportation sector accounted for more than 69 percent of petroleum consumption and more than 37 percent of the country's CO2 emissions. An earlier National Traffic Signal Report Card found that inefficient traffic signals contribute to 295 million vehicle-hours of traffic delay, making up to 10 percent of all traffic-related delays.

The project intends to leverage the capabilities of connected vehicles and infrastructure to optimize and manage traffic flow. While adaptive traffic control systems (ATCS) have been in use for a half century to improve mobility and traffic efficiency, they were not designed to address fuel consumption and emissions. Inefficient traffic systems increase idling time and stop-and-go traffic. The National Transportation Operations Coalition has graded the state of the nation's traffic signals as D+.

"The next step in the evolution [of intelligent transportation systems] is the merging of these systems through AI," noted Aleksandar Stevanovic, associate professor of civil and environmental engineering at Pitt's Swanson School of Engineering and director of the Pittsburgh Intelligent Transportation Systems (PITTS) Lab. "Creation of such a system, especially for dense urban corridors and sprawling exurbs, can greatly improve energy and sustainability impacts. This is critical as our transportation portfolio will continue to have a heavy reliance on gasoline-powered vehicles for some time."

The goal of the three-year project is to develop a dynamic feedback Ecological Automotive Traffic Control System (Eco-ATCS), which reduces fuel consumption and greenhouse gases while maintaining a highly operable and safe transportation environment. The integration of AI will allow additional infrastructure enhancements including emergency vehicle preemption, transit signal priority, and pedestrian safety. The ultimate goal is to reduce corridor-level fuel consumption by 20 percent.

Read the source articles and information in Mirage and in a press release from the UTC Center for Urban Informatics and Progress.

Digitizing the offline: How Alibaba’s FashionAI can revive the waning retail industry

Abhishek Jha
15 Nov 2017
3 min read
Imagine a visit to a store where you end up forgetting what you wanted to purchase, but the screen in front of you tells you all about your preferences. Before you struggle to pick a selection, the system tells you what you could possibly try and which items you are likely to end up purchasing. All this is happening in the malls of China, thanks to artificial intelligence, and thanks to the Alibaba Group, which is on a mission to revive the offline retail market.

Ever since its inception in 2009, the Singles' Day festival has been considered the biggest shopping opportunity in China. There was no better occasion for Alibaba to test its artificial intelligence product FashionAI. And this year, the sales zoomed. Alibaba's gross merchandise volume reached a mammoth 25 billion dollars on Nov. 11, breaking last year's gross figure of $17.8 billion by a considerable margin. This was majorly attributed to the FashionAI initiative.

At a time when offline retail is in decline all across the world, Alibaba's FashionAI could single-handedly reinvent the market. It will drag you back to the malls. When the tiny sensors embedded in the clothes you just tried on suggest all the 'matching' items, you do not mind visiting a store with an easy-to-use, recognizable interface at the front that saves you from all the old drudgeries of retail shopping.

The FashionAI screen interface uses machine learning for its suggestions, based on the items being tried on. It extends the information stored in the product tag to generate recommendations. Using the system, a customer can try clothes on, get related 'smart' suggestions from the AI, and then finalize a selection on the screen. Most importantly, though, the AI assistant doesn't intend to replace humans with robots; it instead integrates the two to deliver better service. So when you want to try something different, you can click a button and a store attendant will be right there.

Why are we cutting down on human intervention then? The point is, it is nearly impossible for human staff to remember the shopping preferences of each customer, whereas an artificial intelligence system can do it at scale. This is why researchers thought to apply deep learning to real-world scenarios like these. Unlike a human store attendant who gets irked by your massive shopping tantrums in terms of choices, the AI system has been programmed to leverage big data for making 'smart' decisions. What's more, it gets better with time, learning more and more from the inputs it receives from its surroundings.

We could say FashionAI is still an experiment. But Alibaba is on course to create history if the Chinese e-commerce giant succeeds in fueling life back into retail. As CEO Daniel Zhang reiterated, Alibaba is going to "digitize the offline retail world." This is quite a first for such a line of thinking. But then, customers don't distinguish between offline and online as long as it serves their interest.


Sony resurrects robotic pet Aibo with advanced AI

Abhishek Jha
01 Nov 2017
3 min read
A decade back, when CEO Howard Stringer decided to discontinue Sony's iconic entertainment robot AIBO, its progenitor Toshitada Doi famously staged a mock funeral, lamenting not so much Aibo's disbandment as the death of Sony's risk-taking spirit. Today, as the Japanese firm's sales have soared to a decade high, beating projected estimates, Aibo is back from the dead.

The revamped pet looks cuter than ever before after nearly a decade on hold, and it has been infused with a range of sensors, cameras, microphones and upgraded artificial intelligence features. The new Aibo is an ivory-white, plastic-covered hound that even has the ability to connect to mobile networks. Using actuators, it can move its body remarkably well, while two OLED panels in its eyes display an array of expressions. Most importantly, it comes with a unique 'adaptive' behavior that includes being able to actively recognize its owner and run over to them, learning and interacting in the process, detecting smiles and words of praise, and responding to all those head and back scratches. In short, a real-life dog without the canine instincts.

Priced at around $1,735 (198,000 yen), Aibo includes a SIM card slot to connect to the internet and access Sony's AI cloud to analyze and learn how other robot dogs are behaving on the network. Sony says it does not intend to replace a digital assistant like Google Home, but that Aibo could be a wonderful companion for children and families, forming an "emotional bond" with love, affection, and joy. The cloud service that powers Aibo's AI is, however, expensive: a basic three-year subscription plan is priced at $26 (2,980 yen) per month, or you could sign up upfront for three years at around $790 (90,000 yen). As far as battery life is concerned, the robot takes three hours to fully charge after its battery is depleted by two hours of activity.

"It was a difficult decision to stop the project in 2006, but we continued development in AI and robotics," Sony CEO Kazuo Hirai said, speaking at a launch event. "I asked our engineers a year and a half ago to develop Aibo because I strongly believe robots capable of building loving relationships with people help realize Sony's mission."

When Sony initially launched AIBO in 1999, it was well ahead of its time. But after the initial euphoria, the product somehow failed to win mainstream buyers, as reboot after reboot failed to generate profits. At that time, Sony clearly had to make a decision, as its core electronics business struggled in price wars. Today, times are different: AI fever has gripped the tech world.

A plastic bone ('aibone') for the robotic dog costs around 2,980 yen. And that's the price you pay for keeping a robotic buddy around. The word "aibo" literally means companion, after all.


Youtube promises to reduce recommendations of ‘conspiracy theory’. Ex-googler explains why this is a 'historic victory'

Sugandha Lahoti
12 Feb 2019
4 min read
Talks of AI algorithms causing harms including addiction, radicalization. political abuse and conspiracies, disgusting kids videos and the danger of AI propaganda are all around. Last month, YouTube announced an update regarding YouTube recommendations aiming to reduce the recommendations of videos that promote misinformation ( eg: conspiracy videos, false claims about historical events, flat earth videos, etc). In a historical move, Youtube changed its Artificial Intelligence algorithm instead of favoring another solution, which may have cost them fewer resources, time, and money. Last Friday, an ex-googler who helped build the YouTube algorithm, Guillaume Chaslot, appreciated this change in AI, calling it “a great victory” which will help thousands of viewers from falling down the rabbit hole of misinformation and false conspiracy theories. In a twitter thread, he presented his views as someone who has had experience working on Youtube’s AI. Recently, there has been a trend in Youtube promoting conspiracy videos such as ‘Flat Earth theories’. In a blog post, Guillaume Chaslot explains, “Flat Earth is not a ’small bug’. It reveals that there is a structural problem in Google’s AIs and they exploit weaknesses of the most vulnerable people, to make them believe the darnedest things.” Youtube realized this problem and has made amends to its algorithm. “It’s just another step in an ongoing process, but it reflects our commitment and sense of responsibility to improve the recommendations experience on YouTube. To be clear, this will only affect recommendations of what videos to watch, not whether a video is available on YouTube. As always, people can still access all videos that comply with our Community Guidelines”, states the YouTube team in a blog post. Chaslot appreciated this fact in his twitter thread saying that although Youtube had the option to ‘make people spend more time on round earth videos’, they chose the hard way by tweaking their AI algorithm. AI algorithms also often get biased by tiny groups of hyperactive users. As Chaslot notes, people who spend their lives on YouTube affect recommendations more. The content they watch gets more views, which leads to Youtubers noticing and creating more of it, making people spend even more time on that content. This is because YouTube optimizes for things you might watch, not things you might like. As a hacker news user observed, “The problem was that pathological/excessive users were overly skewing the recommendations algorithms. These users tend to watch things that might be unhealthy in various ways, which then tend to get over-promoted and lead to the creation of more content in that vein. Not a good cycle to encourage.” The new change in Youtube’s AI makes use of machine learning along with human evaluators and experts from all over the United States to train these machine learning systems responsible for generating recommendations. Evaluators are trained using public guidelines and offer their input on the quality of a video. Currently, the change is applied only to a small set of videos in the US as the machine learning systems are not very accurate currently. The new update will roll out in different countries once the systems become more efficient. However, there is another problem lurking around which is probably even bigger than conspiracy videos. This is the addiction to spending more and more time online. 
The AI engines used in major social platforms, including but not limited to YouTube, Netflix, and Facebook, all want people to spend as much time on them as possible. A Hacker News user commented, "This is just addiction peddling. Nothing more. I think we have no idea how much damage this is doing to us. It's as if someone invented cocaine for the first time and we have no social norms or legal framework to confront it."

Nevertheless, netizens generally welcomed YouTube's update to its AI engine. As Chaslot concluded in his Twitter thread, "YouTube's announcement is a great victory which will save thousands. It's only the beginning of a more humane technology. Technology that empowers all of us, instead of deceiving the most vulnerable." It is now up to YouTube to strike a balance between maintaining a platform for free speech and living up to its responsibility to users.

Is the YouTube algorithm's promoting of #AlternativeFacts like Flat Earth having a real-world impact?
YouTube to reduce recommendations of 'conspiracy theory' videos that misinform users in the US
YouTube bans dangerous pranks and challenges
Is YouTube's AI Algorithm evil?

Dr.Brandon explains NLP (Natural Language Processing) to Jon

Aarthi Kumaraswamy
25 Oct 2017
5 min read
[box type="shadow" align="" class="" width=""] Dr.Brandon: Welcome everyone to the first episode of 'Date with data science'. I am Dr. Brandon Hopper, B.S., M.S., Ph.D., Senior Data Scientist at BeingHumanoid and, visiting faculty at Fictional AI University.  Jon: And I am just Jon - actor, foodie and Brandon's fun friend. I don't have any letters after my name but I can say the alphabets in reverse order. Pretty cool, huh! Dr.Brandon: Yes, I am sure our readers will find it very amusing Jon. Talking of alphabets, today we discuss NLP. Jon: Wait, what is NLP? Is it that thing Ashley's working on? Dr.Brandon: No. The NLP we are talking about today is Natural Language Processing, not to be confused with Neuro-Linguistic Programming.   Jon: Oh alright. I thought we just processed cheese. How do you process language? Don't you start with 'to understand NLP, we must first understand how humans started communicating'! And keep it short and simple, will you? Dr.Brandon: OK I will try my best to do all of the above if you promise not to doze off. The following is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava. [/box]   NLP helps analyze raw textual data and extract useful information such as sentence structure, sentiment of text, or even translation of text between languages. Since many sources of data contain raw text, (for example, reviews, news articles, and medical records). NLP is getting more and more popular, thanks to providing an insight into the text and helps make automatized decisions easier. Under the hood, NLP is often using machine-learning algorithms to extract and model the structure of text. The power of NLP is much more visible if it is applied in the context of another machine method, where, for example, text can represent one of the input features. NLP - a brief primer Just like artificial neural networks, NLP is a relatively "old" subject, but one that has garnered a massive amount of attention recently due to the rise of computing power and various applications of machine learning algorithms for tasks that include, but are not limited to, the following: Machine translation (MT): In its simplest form, this is the ability of machines to translate one language of words to another language of words. Interestingly, proposals for machine translation systems pre-date the creation of the digital computer. One of the first NLP applications was created during World War II by an American scientist named Warren Weaver whose job was to try and crack German code. Nowadays, we have highly sophisticated applications that can translate a piece of text into any number of different languages we desire!‌ Speech recognition (SR): These methodologies and technologies attempt to recognize and translate spoken words into text using machines. We see these technologies in smartphones nowadays that use SR systems in tasks ranging from helping us find directions to the nearest gas station to querying Google for the weekend's weather forecast. As we speak into our phones, a machine is able to recognize the words we are speaking and then translate these words into text that the computer can recognize and perform some task if need be. Information retrieval (IR): Have you ever read a piece of text, such as an article on a news website, for example, and wanted to see similar news articles like the one you just read? 
This is but one example of an information retrieval system that takes a piece of text as an "input" and seeks to obtain other relevant pieces of text similar to the input text. Perhaps the easiest and most recognizable example of an IR system is doing a search on a web-based search engine. We give some words that we want to "know" more about (this is the "input"), and the output are the search results, which are hopefully relevant to our input search query. Information extraction (IE): This is the task of extracting structured bits of information from unstructured data such as text, video and pictures. For example, when you read a blog post on some website, often, the post is tagged with a few keywords that describe the general topics about this posting, which can be classified using information extraction systems. One extremely popular avenue of IE is called Visual Information Extraction, which attempts to identify complex entities from the visual layout of a web page, for example, which would not be captured in typical NLP approaches. Text summarization (darn, no acronym here!): This is a hugely popular area of interest. This is the task of taking pieces of text of various length and summarizing them by identifying topics, for example. In the next chapter, we will explore two popular approaches to text summarization via topic models such as Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA). If you enjoyed the above excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla, and Michal Malohlava, check out the book to learn how to Use Spark streams to cluster tweets online Utilize generated models for off-line/on-line prediction Transfer learning from an ensemble to a simpler Neural Network Use GraphFrames, an extension of DataFrames to graphs, to study graphs using an elegant query language Use K-means algorithm to cluster movie reviews dataset and more
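To make the topic-model idea above a bit more concrete, here is a minimal sketch of LDA-based topic extraction. It is not the book's Spark code; it uses scikit-learn's CountVectorizer and LatentDirichletAllocation, and the toy documents and parameter values are illustrative assumptions only.

```python
# A minimal LDA topic-modeling sketch (scikit-learn, not the book's Spark code).
# The toy corpus and the choice of two topics are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the spacecraft launched into orbit around the planet",
    "the rocket engine burned fuel during the launch",
    "the chef seasoned the soup with fresh herbs",
    "a new recipe combines garlic, herbs, and olive oil",
]

# Turn raw text into a bag-of-words term-count matrix.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Fit a 2-topic LDA model on the term counts.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Print the top words per topic as a crude "summary" of each theme.
terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"Topic {topic_idx}: {', '.join(top_terms)}")
```

The book's own workflow is built on Spark, but the shape of the pipeline is the same: vectorize the text, fit a topic model, then inspect the dominant terms of each topic.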


AI Autonomous Cars Might Have Just A Four-Year Endurance Lifecycle from AI Trends

Matthew Emerick
15 Oct 2020
14 min read
By Lance Eliot, the AI Trends Insider

After AI autonomous self-driving cars have been abundantly fielded onto our roadways, one intriguing question that has so far gotten scant attention is how long those self-driving cars will last.

It is easy to simply assume that the endurance of a self-driving car will be about the same as that of today's conventional cars, especially since most self-driving cars currently use a conventional car rather than a special-purpose built vehicle.

But there is something about self-driving cars that does not immediately meet the eye: they are likely to accumulate a lot of miles in a short period. Given that the AI is doing the driving, there is no longer a cap on the number of miles a car might be driven in a given time period, a cap usually imposed by the availability of a human driver. Instead, the AI is a 24x7 driver that can work non-stop, turning the self-driving car into a continuously moving and available ride-sharing vehicle.

With all that mileage, the number of years of endurance will be lower than for a comparable conventional car that is driven only intermittently. You could say the car is still the car; the difference is that it might accumulate the same total mileage in a much shorter period of time and thus reach its end-of-life sooner.

Some automakers have speculated that self-driving cars might only last about four years. This comes as quite a shocking revelation: AI-based autonomous cars might be usable for a scant four years and then presumably end up on the scrap heap. Let's unpack the matter and explore the ramifications of a presumed four-year life span for self-driving cars.

For my framework about AI autonomous cars, see the link here: https://aitrends.com/ai-insider/framework-ai-self-driving-driverless-cars-big-picture/
Why this is a moonshot effort, see my explanation here: https://aitrends.com/ai-insider/self-driving-car-mother-ai-projects-moonshot/
For more about the levels as a type of Richter scale, see my discussion here: https://aitrends.com/ai-insider/richter-scale-levels-self-driving-cars/
For the argument about bifurcating the levels, see my explanation here: https://aitrends.com/ai-insider/reframing-ai-levels-for-self-driving-cars-bifurcation-of-autonomy/

Life Span Of Cars

According to various stats about today's cars, the average age of a conventional car in the United States is estimated at 11.6 years.

Some use the 11.6 years, or a rounded 12 years, as a surrogate for how long a car lasts in the U.S., though this is somewhat problematic since the average age is not the endpoint of a car's life; it encapsulates a range of ages, including a slew of cars retired at a much younger age and those that hang on to a much older age.

Indeed, one of the fastest-growing segments of car ages is the group that is 16 years or older, amounting to an estimated 81 million such cars by the year 2021. Of those 81 million cars, around one-fourth will be more than 25 years old. In short, cars are being kept around longer and longer.

When you buy a new car, the rule-of-thumb often quoted by automakers is that the car should last about 8 years or 150,000 miles.
This is obviously a low-ball kind of posturing, trying to set expectations so that car buyers will be pleased if their cars last longer. One supposes it also gets buyers into the mental mode of considering their next car purchase in about eight years or so.

Continuing the effort to consider various stats about cars, Americans drive their cars for about 11,000 miles per year. If a new car is supposed to last for 150,000 miles, the math suggests that at 11,000 miles per year you could drive the car for roughly 14 years (150,000 miles divided by 11,000 miles per year is about 13.6).

Of course, the average everyday driver uses their car for easy driving such as commuting to work and driving to the grocery store. Generally, you wouldn't expect the average driver to be putting an excessive number of miles onto a car.

What about those that push their cars to the limit and drive them in a much harsher manner? Various published stats about ridesharing drivers such as Uber and Lyft suggest that they are amassing about 1,000 miles per week on their cars. If so, the number of miles per year would be approximately 50,000. At a pace of 50,000 miles per year, these on-the-go cars would presumably only last about 3 years, based on the math of 150,000 miles divided by 50,000 miles per year. In theory, this implies that a ridesharing car being used today will last about 3 years.

For self-driving cars, most would agree that a driverless car is going to be used in a similar ridesharing manner and be on the road quite a lot. This seems sensible. To make as much money as possible with a driverless car, you would likely seek to maximize its use. Put it onto a ridesharing network and let it be used as much as people are willing to book it and pay to use it.

Without the cost and hassle of having to find and use a human driver, the AI will presumably be willing to drive whenever and for however long is needed. As such, a true self-driving car is being touted as likely to be running 24x7.

In reality, you can't have a self-driving car that is always roaming around, since time needs to be set aside for ongoing maintenance, repairs, and fueling or recharging.

Overall, it seems logical to postulate that a self-driving car will be used at least as much as today's human-driven ridesharing cars, and probably a lot more, since the self-driving car is not limited by human driving constraints.

In short, if today's ridesharing cars are hitting their boundaries at perhaps three to five years, you could reasonably extend that same thinking to driverless cars and assume that self-driving cars might only last about four years. The shock that a driverless car might only last four years is not quite as surprising when you consider that a true self-driving car is going to be pushed to its limits in terms of usage and be a ridesharing goldmine (presumably) that will undergo nearly continual driving time.
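The lifespan arithmetic above is simple enough to sketch out explicitly. The snippet below just divides the 150,000-mile rule of thumb by a few assumed annual mileages; the 150,000-mile figure and the 11,000 and 50,000 miles-per-year estimates come from the article, while the 24x7 self-driving figure is purely an illustrative assumption.

```python
# Back-of-the-envelope lifespan estimate: rated mileage / annual mileage.
# The 150,000-mile rating and the first two annual-mileage figures are the
# article's rules of thumb; the 24x7 figure is an illustrative assumption.
RATED_LIFE_MILES = 150_000

annual_mileage = {
    "average everyday driver": 11_000,
    "today's ridesharing driver (~1,000 miles/week)": 50_000,
    "hypothetical 24x7 self-driving car (assumed)": 70_000,
}

for usage, miles_per_year in annual_mileage.items():
    years = RATED_LIFE_MILES / miles_per_year
    print(f"{usage}: ~{years:.1f} years")
```

On those assumptions, the everyday car lasts roughly 14 years, the ridesharing car about 3, and a heavily utilized self-driving car even less, which is consistent with the roughly four-year speculation discussed above.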
For why remote piloting or operating of self-driving cars is generally eschewed, see my explanation here: https://aitrends.com/ai-insider/remote-piloting-is-a-self-driving-car-crutch/
To be wary of fake news about self-driving cars, see my tips here: https://aitrends.com/ai-insider/ai-fake-news-about-self-driving-cars/
The ethical implications of AI driving systems are significant, see my indication here: http://aitrends.com/selfdrivingcars/ethically-ambiguous-self-driving-cars/
Be aware of the pitfalls of normalization of deviance when it comes to self-driving cars, here's my call to arms: https://aitrends.com/ai-insider/normalization-of-deviance-endangers-ai-self-driving-cars/

Factors Of Car Aging

Three key factors determine how long a car will last:
How the car was built
How the car is used
How the car is maintained

Let's consider how those key factors apply to self-driving cars.

In the case of today's early versions of what are intended to be driverless cars, by and large most automakers are using a conventional car as the basis for their driverless car rather than building an entirely new kind of vehicle. We will eventually see entirely new kinds of cars built to fully leverage driverless capability, but for right now it is easier and more expedient to use a conventional car as the cornerstone of an autonomous car. Therefore, for the foreseeable future, we can assume that the manner in which a driverless car is built is in keeping with how a conventional car is built, implying that the vehicle itself will last as long as a conventional car might last.

In terms of usage, as already mentioned, a driverless car is going to get far more use than a car owned by an average everyday driver, and at least as much as today's ridesharing cars. The usage is bound to be much higher.

The ongoing maintenance of a self-driving car will become vital to its owner. I say this because any shortcomings in maintenance would tend to mean that the driverless car will be in the shop and not available on the streets. The revenue stream from an always-on self-driving car will be a compelling reason for owners to make sure that their self-driving car gets the proper amount of maintenance. In that sense, the odds are that a driverless car will be better maintained than either an average everyday car or even today's ridesharing cars.

One additional element to consider for driverless cars consists of the add-ons for sensory capabilities and computer processing. Those sensory devices, such as cameras, radar, ultrasonic sensors, LIDAR, and so on, need to be factored into the longevity of the overall car, and the same applies to the on-board computer chips and memory.

Why Retire A Car

The decision to retire a car is based on a trade-off between continuing to pour money into a car that is breaking down and excessively costing money to keep afloat, versus ditching the car and opting for a new or newer car instead. Thus, when you look at how long a car will last, you are also silently considering the cost of a new or newer car.

We don't yet know what the cost of a driverless car will be. If the cost to purchase a self-driving car is very high, you would presumably have a greater incentive to try to keep a used self-driving car in sufficient working order.

There is also a safety element in deciding whether to retire a self-driving car.
Suppose a driverless car that is routinely maintained is as safe as a new self-driving car, but eventually the maintenance can only achieve so much in ensuring that the driverless car remains as safe on the roadways as a new or newer self-driving car would be. The owner of the used self-driving car would then need to ascertain whether the safety degradation means the used driverless car needs to be retired.

Used Market For Self-Driving Cars

With conventional cars, an owner who first purchased a new car will likely sell it after a while. We all realize that a conventional car might end up being passed from one buyer to another over its lifespan. Will there be an equivalent market for used self-driving cars?

You might be inclined to immediately suggest that once a self-driving car has reached the point of no longer being safe enough, it needs to be retired. We don't yet know, and no one has established, what that safety juncture or threshold might be. There could be a used self-driving car market that involves selling a used driverless car that is still within some bounds of being safe.

Suppose a driverless car owner who had used their self-driving car extensively in a downtown city setting opted to sell the autonomous car to someone who lived in a suburban community. The logic might be that the self-driving car was no longer sufficient for use in stop-and-go traffic but might be viable in a less stressful suburban locale.

Overall, no one is especially thinking about used self-driving cars, which is admittedly a concern far off in the future and therefore not a topic looming over us today.

Retirement Of A Self-Driving Car

Other than becoming a used car, what else might happen to a self-driving car after it has been in use for a while? Some have wondered whether it might be feasible to convert a self-driving car into a human-driven car, doing so to place it into the used market for human-driven cars.

Well, it depends on how the self-driving car was originally made. If the self-driving car has all of the mechanical and electronic guts for human driving controls, you could presumably unplug the autonomy and revert the car to being human-driven. I would assert that this is very unlikely, and you won't see self-driving cars being transitioned into human-driven cars. All told, it would seem that once a self-driving car has reached its end of life, the vehicle will be scrapped.

If self-driving cars are being placed onto the junk heap every four years, this raises the specter that we are going to have a lot of car junk piling up. For environmentalists, this is certainly disconcerting.

Generally, today's cars are highly recyclable and reusable. Estimates suggest that around 80% of a car can be recycled or reused. For driverless cars, assuming they are built like today's conventional cars, you would be able to attain a similar recycled-and-reused percentage. The add-on sensory devices and computer processors might be recyclable and reusable too, though this is not necessarily the case, depending upon how the components were made.
For why remote piloting or operating of self-driving cars is generally eschewed, see my explanation here: https://aitrends.com/ai-insider/remote-piloting-is-a-self-driving-car-crutch/
To be wary of fake news about self-driving cars, see my tips here: https://aitrends.com/ai-insider/ai-fake-news-about-self-driving-cars/
The ethical implications of AI driving systems are significant, see my indication here: http://aitrends.com/selfdrivingcars/ethically-ambiguous-self-driving-cars/
Be aware of the pitfalls of normalization of deviance when it comes to self-driving cars, here's my call to arms: https://aitrends.com/ai-insider/normalization-of-deviance-endangers-ai-self-driving-cars/

Conclusion

Some critics would be tempted to claim that automakers would adore having self-driving cars that last only four years. Presumably, it would mean that automakers will be churning out new cars hand-over-fist, doing so to try to keep up with the demand for an ongoing supply of new driverless cars. On the other hand, some pundits have predicted that we won't need as many cars as we have today, since a smaller number of ridesharing driverless cars will fulfill our driving needs, obviating the need for everyone to have a car. No one knows.

Another facet to consider involves the pace at which high-tech might advance and thus cause a heightened turnover in self-driving cars. Suppose the sensors and computer processors put into a driverless car are eclipsed in just a few years by faster, cheaper, and better sensors and computer processors. If the sensors and processors of a self-driving car are built in, meaning that you can't readily swap them out, then another driving force behind a quicker life cycle for driverless cars could be the desire to make use of the latest in high-tech.

The idea of retiring a driverless car in four years doesn't seem quite as shocking after analyzing the basis for such a belief. Whether society is better off as a result of self-driving cars, and whether those self-driving cars will only last four years, are complex questions. We'll need to see how this all plays out.

Copyright 2020 Dr. Lance Eliot

This content is originally posted on AI Trends.

[Ed. Note: For readers interested in Dr. Eliot's ongoing business analyses about the advent of self-driving cars, see his online Forbes column: https://forbes.com/sites/lanceeliot/]