Tech Guides


The Deep Learning Framework Showdown: TensorFlow vs CNTK

Aaron Lazar
30 Oct 2017
6 min read
The question several Deep Learning engineers may ask themselves is: Which is better, TensorFlow or CNTK? Well, we're going to answer that question for you, taking you through a closely fought match between the two most exciting frameworks. So, here we are, ladies and gentlemen, it's fight night and it's a full house. In the Red corner, weighing in at two hundred and seventy pounds of Python and topping out at over ten thousand frames per second; managed by the American tech giant, Google; we have the mighty, the beefy, TensorFlow! In the Blue corner, weighing in at two hundred and thirty pounds of C++ muscle, we have one of the top toolkits that can comfortably scale beyond a single machine. Managed by none other than Microsoft, it's fast, it's furious, it's CNTK aka the Microsoft Cognitive Toolkit!

And we're into Round One… TensorFlow and CNTK are looking quite menacingly at each other and are raring to take down their opponents. TensorFlow seems pleased that its compile times are considerably faster than those of its predecessor, Theano. Although, it looks like happiness came a tad bit soon. CNTK, light and bouncy on its feet, comes straight out of nowhere with a whopping seventy thousand frames-per-second upper cut, knocking TensorFlow to the floor. TensorFlow looks like it's in no mood to give up anytime soon. It makes itself so simple to use and understand that even students can pick it up and start training their own models. This isn't the case with CNTK, which is still begging to shed its complexity. On the other hand, CNTK seems to be thrashing TensorFlow in terms of 3D convolution, where CNTK can clearly recognize images from streaming content. TensorFlow also tries its best to run LSTM RNNs, but in vain.

The crowd keeps cheering on… Wait a minute... are they calling out for TensorFlow? Yes they are! There's hardly any cheering for CNTK. This is embarrassing! Looks like its community support can't match up to TensorFlow's. And ladies and gentlemen, that does make a difference - we can see TensorFlow improving on several fronts and gradually getting back in the game! TensorFlow huffs and puffs as it tries to prove that it's not just about deep learning and that it has tools in its pocket that can support other algorithms such as reinforcement learning. It conveniently whips out TensorBoard, and drops CNTK to the floor with its beautiful visualizations. TensorFlow now has the upper hand and is trying hard to pin CNTK to the floor, using its R support to finish it off. But CNTK tactfully breaks loose and leaves TensorFlow on the floor - still not ready to be used in production. And there goes the bell for Round One!

Both fighters look exhausted but you can see a faint twinkle in TensorFlow's eye, primarily because it survived Round One. Google seems to be working hard to prep it for Round Two and is making several improvements in terms of speed and flexibility, and, most importantly, is getting it ready for production. Meanwhile, Microsoft boosts CNTK's spirits with a shot of Python APIs in its blood. As it moves towards version 2.0, there are a lot of improvements to CNTK; Microsoft has ensured that it's not left behind, for example by adding a backend for Keras, which puts it on par with TensorFlow. Moreover, there are quite a few experimental features that it looks ready to enter the ring with, like the Java API for example.

It's the final round and boy, are these two into a serious stare-down! The referee waves them in and off they are. CNTK needs to get back at TensorFlow.
Comfortably supporting multiple GPUs and CPUs out of the box, across both Windows and Linux, CNTK has an advantage over TensorFlow. Is it going to use that trump card? Yes it is! A thousand GPUs and a hundred machines in, and CNTK is raining blows on TensorFlow. TensorFlow clearly drops the ball when it comes to multiple machines, and it rather complicates things. It's high time that TensorFlow turned the tables. Lo and behold! It shows off its mobile deep learning capabilities with TensorFlow Lite, clearly flipping CNTK flat on its back. This is revolutionary and a tremendous breakthrough for TensorFlow! CNTK, however, is clearly the people's choice when it comes to language compatibility. With support for C++, Python, C#/.NET and now Java, it's clearly winning in this area.

Round Two is coming to an end, ladies and gentlemen, and it's a neck-and-neck battle out there. We're not sure the judges are going to be able to choose a clear winner, from the looks of it. And… there goes the bell! While the scores are being tallied, we go over to the teams and some spectators for some gossip on the what's what of deep learning. Did you know having multiple machine support is a huge advantage? It increases speed and efficiency by almost 10 times! That's something! We also got to know that TensorFlow is training hard and is picking up positives from its rival, CNTK. There are also rumors about a new kid called MXNet (read about it here), that has APIs in R, Python and even in Julia! This makes it one helluva framework in terms of flexibility and speed. In fact, AWS is already implementing it while Apple is also rumored to be using it. Clearly, something to watch out for.

And finally, the judges have made their decision. Ladies and gentlemen, after two rounds of sheer entertainment, we have the results...

Processing speed: TensorFlow 0, CNTK 1
Learning curve: TensorFlow 1, CNTK 0
Production readiness: TensorFlow 0, CNTK 1
Community support: TensorFlow 1, CNTK 0
CPU, GPU computation support: TensorFlow 0, CNTK 1
Mobile deep learning: TensorFlow 1, CNTK 0
Multiple language compatibility: TensorFlow 0, CNTK 1

It's a unanimous decision and just as we thought, CNTK is the heavyweight champion! CNTK clearly beat TensorFlow in terms of performance, because of its flexibility, speed and ability to be used in production! As a Deep Learning engineer, should you want to use one of these frameworks in your tasks, you should check out their features thoroughly, test them out with a test dataset and then apply them to your actual data. After all, it's the choices we make that define a win or a loss - simplicity over resource utilisation, or speed over platform, we must choose our tools wisely. For more information on the kind of tests that both tools have been put through, read the research paper presented by Shaohuai Shi, Qiang Wang, Pengfei Xu and Xiaowen Chu from the Department of Computer Science, Hong Kong Baptist University and these benchmarks.
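As a practical footnote to the Keras point above: with the older multi-backend Keras 2.x releases, the same model definition could target either framework simply by switching the backend, for example through the KERAS_BACKEND environment variable. A minimal sketch, assuming a multi-backend Keras install with the CNTK backend available:

```python
import os

# Select the backend before Keras is imported; valid values for
# multi-backend Keras 2.x include "tensorflow", "cntk", and "theano".
os.environ["KERAS_BACKEND"] = "cntk"

from keras.models import Sequential
from keras.layers import Dense

# The model definition itself is backend-agnostic.
model = Sequential([
    Dense(64, activation="relu", input_shape=(20,)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```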


AI to the rescue: 5 ways machine learning can assist during emergency situations

Sugandha Lahoti
16 Jan 2018
9 min read
In the wee hours of the night on January 4 this year, over 9.8 million people experienced a magnitude 4.4 earthquake that rumbled across the San Francisco Bay Area. This was followed by a magnitude 7.6 earthquake in the Caribbean sea on January 9, following which a tsunami advisory was in effect for Puerto Rico and the U.S. and British Virgin Islands. In the past 6 months, the United States alone has witnessed four back-to-back storms from one brutal hurricane season, and a massive wildfire with almost 2 million acres of land ablaze. Natural disasters across the globe are becoming more damaging and more frequent. Since 1970, the number of disasters worldwide has more than quadrupled to around 400 a year. This series of natural disasters has strained emergency services and disaster relief operations beyond capacity. We now need to look for newer ways to assist the affected people and automate the recovery process. Artificial intelligence and machine learning have advanced to the state where they are highly proficient in making predictions, and in identification and classification tasks. These capabilities can also be applied to prevent disasters or respond quickly in case of an emergency. Here are five ways AI can lend a helping hand during emergency situations.

1. Machine Learning for targeted disaster relief management

In case of any disaster, the first step is to formulate a critical response team to help those in distress. Before the team goes into action, it is important to analyze and assess the extent of damage and to ensure that the right aid goes first to those who need it the most. AI techniques such as image recognition and classification can be quite helpful in assessing the damage, as they can analyze and observe images from satellites. They can immediately and efficiently filter these images, which would otherwise have required months to be sorted manually. AI can identify objects and features such as damaged buildings, flooding, and blocked roads from these images. It can also identify temporary settlements which may indicate that people are homeless, so that the first care can be directed towards them.

Artificial intelligence and machine learning tools can also aggregate and crunch data from multiple resources such as crowd-sourced mapping materials or Google Maps. Machine learning approaches then combine all this data, remove unreliable data, and identify informative sources to generate heat maps. These heat maps can identify areas in need of urgent assistance and direct relief efforts to those areas. Heat maps are also helpful for government and other humanitarian agencies in deciding where to conduct aerial assessments. DigitalGlobe provides space imagery and geospatial content, and its Open Data Program is a special program for disaster response. The software learns how to recognize buildings on satellite photos by learning from the crowd. DigitalGlobe releases pre- and post-event imagery for select natural disasters each year, and their crowdsourcing platform, Tomnod, prioritizes micro-tasking to accelerate damage assessments. Following the Nepal earthquakes in 2015, Rescue Global and academicians from the Orchid Project used machine learning to carry out rescue activities. They took pre- and post-disaster imagery and utilized crowd-sourced data analysis and machine learning to identify locations affected by the quakes that had not yet been assessed or received aid.
This information was then shared with relief workforces to facilitate their activities.

2. Next Generation 911

911 is the first point of contact during any emergency situation. 911 dispatch centers are already overloaded with calls on a regular day. In case of a disaster or calamity, the number of calls can quadruple, or grow even more. This calls for augmenting traditional 911 emergency centers with newer technologies for better management. Traditional 911 centers rely on voice-based calls alone. Next-gen dispatch services are upgrading their emergency dispatch technology with machine learning to receive more types of data. So now they can ingest data not just from calls but also from text, video, audio, and pictures, and analyze it to make quick assessments. The insights gained from all this information can be passed on to the emergency response teams out in the field to efficiently carry out critical tasks. The Association of Public-Safety Communications Officials (APCO) has employed IBM's Watson to listen to 911 calls. This initiative is to help emergency call centers improve operations and public safety by using Watson's speech-to-text and analytics programs. Using Watson's speech-to-text function, the context of each call is fed into the AI's analytics program, allowing improvements in how call centers respond to emergencies. It also helps reduce call times, provide accurate information, and accelerate time-sensitive emergency services.

3. Sentiment analysis on social media data for disaster management and recovery

Social media channels are a major source of news in present times. Some of the most actionable information during a disaster comes from social media users. Real-time images and comments from Facebook, Twitter, Instagram, and YouTube can be analyzed and validated by AI to filter real information from fake. These vital stats can help on-the-ground aid workers reach the point of crisis sooner and direct their efforts to the needy. This data can also help rescue workers reduce the time needed to find victims. In addition, AI and predictive analytics software can analyze digital content from Twitter, Facebook, and YouTube to provide early warnings, ground-level location data, and real-time report verification. In fact, AI could also be used to view the unstructured data and background of pictures and videos posted to social channels and compare them to find missing people.

AI-powered chatbots can help residents affected by a calamity. The chatbot can interact with the victim, or other citizens in the vicinity, via popular social media channels and ask them to upload information such as location, a photo, and some description. The AI can then validate and cross-check this information with other sources and pass on the relevant details to the disaster relief committee. This type of information can assist them with assessing damage in real time and help prioritize response efforts. Artificial Intelligence for Disaster Response (AIDR) is a free and open platform which uses machine intelligence to automatically filter and classify social media messages related to emergencies, disasters, and humanitarian crises. For this, it uses a Collector and a Tagger. The Collector helps in collecting and filtering tweets using keywords and hashtags such as "cyclone" and "#Irma," for example. The Collector works as a word-filter. The Tagger is a topic-filter which classifies tweets by topics of interest, such as "Infrastructure Damage" and "Donations," for example.
The Tagger automatically applies the classifier to incoming tweets collected in real time using the Collector.

4. AI answers distress and help-calls

Emergency relief services are flooded with distress and help calls in the event of any emergency situation. Managing such a huge number of calls is time-consuming and expensive when done manually. There is also a chance of critical information being lost or going unnoticed. In such cases, AI can work as a 24/7 dispatcher. AI systems and voice assistants can analyze massive amounts of calls, determine what type of incident occurred and verify the location. They can not only interact with callers naturally and process those calls, but can also instantly transcribe and translate languages. AI systems can analyze the tone of voice for urgency, filtering redundant or less urgent calls and prioritizing them based on the emergency. Blueworx is a powerful IVR platform which uses AI to replace call center officials. Using AI technology is especially useful when unexpected events such as natural disasters drive up call volume. Their AI engine is well suited to responding to emergency calls because, unlike a call center agent, it can know who a customer is even before they call. It also provides intelligent call routing, proactive outbound notifications, unified messaging, and interactive voice response.

5. Predictive analytics for proactive disaster management

Machine learning and other data science approaches are not limited to assisting on-ground relief teams or assisting only after the actual emergency. Machine learning approaches such as predictive analytics can also analyze past events to identify and extract patterns and populations vulnerable to natural calamities. A large number of supervised and unsupervised learning approaches are used to identify at-risk areas and improve predictions of future events. For instance, clustering algorithms can classify disaster data on the basis of severity. They can separate the climatic patterns which may cause local storms from the cloud conditions which may lead to a widespread cyclone. Predictive machine learning models can also help officials distribute supplies to where people are going, rather than where they were, by analyzing real-time behavior and movement of people. In addition, predictive analytics techniques can provide insight for understanding the economic and human impact of natural calamities. Artificial neural networks take in information such as region, country, and natural disaster type to predict the potential monetary impact of natural disasters. Recent advances in cloud technologies and numerous open source tools have enabled predictive analytics with almost no initial infrastructure investment, so agencies with limited resources can also build systems based on data science and develop more sophisticated models to analyze disasters. Optima Predict, a suite of software by Intermedix, collects and reads information about disasters such as viral outbreaks or criminal activity in real time. The software spots geographical clusters of reported incidents before humans notice the trend and then alerts key officials about it. The data can also be synced with FirstWatch, an online dashboard for EMS (Emergency Medical Services) personnel. Thanks to the multiple benefits of AI, government agencies and NGOs can start utilizing machine learning to deal with disasters.
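To make the Collector-and-Tagger pattern from section 3 concrete, here is a minimal sketch of a keyword filter followed by a topic classifier built with scikit-learn. The tweets, keywords, and labels are made-up toy data; this is an illustration of the pattern, not AIDR's actual pipeline:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled examples standing in for crowd-tagged tweets.
train_texts = [
    "bridge collapsed on main road, cars stranded",
    "school building badly damaged after the cyclone",
    "collecting blankets and food for flood victims",
    "please donate bottled water at the community centre",
]
train_labels = ["Infrastructure Damage", "Infrastructure Damage",
                "Donations", "Donations"]

# "Collector": a simple keyword/hashtag filter.
KEYWORDS = {"cyclone", "#irma", "flood", "earthquake"}

def collect(tweets):
    return [t for t in tweets if any(k in t.lower() for k in KEYWORDS)]

# "Tagger": a topic classifier applied to the collected tweets.
tagger = make_pipeline(TfidfVectorizer(), LogisticRegression())
tagger.fit(train_texts, train_labels)

incoming = [
    "flood water entering houses near the river, roads washed away",
    "we are gathering donations for cyclone survivors #irma",
    "lovely sunny day at the beach",
]
for tweet in collect(incoming):
    print(tagger.predict([tweet])[0], "->", tweet)
```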
As AI and allied fields like robotics develop and expand further, we may see a fleet of drone services equipped with sophisticated machine learning. These advanced drones could expedite access to real-time information at disaster sites using video capturing capabilities and also deliver lightweight physical goods to hard-to-reach areas. As with every progressing technology, AI will also build on its existing capabilities. It has the potential to eliminate outages before they are detected and give disaster response leaders an informed, clearer picture of the disaster area, ultimately saving lives.


Best Machine Learning Datasets for beginners

Natasha Mathur
19 Sep 2018
13 min read
"It's not who has the best algorithm that wins. It's who has the most data." ~ Andrew Ng

If you look at the way machine learning algorithms were trained five or ten years ago, you would notice one huge difference: training algorithms are much better and more efficient today than they were a few years ago. All credit goes to the hefty amount of data that is available to us today. But how does machine learning make use of this data? Let's have a look at the definition of machine learning: "Machine Learning provides computers or machines the ability to automatically learn from experience without being explicitly programmed." Machines "learn from experience" when they're trained, and this is where data comes into the picture. How are they trained? Datasets! This is why it is so crucial that you feed these machines with the right data for whatever problem you want them to solve.

Why datasets matter in Machine Learning

The simple answer is that machines, like humans, are capable of learning once they see relevant data. But where they vary from humans is the amount of data they need to learn from. You need to feed your machines with enough data in order for them to do anything useful for you. This is why machines are trained using massive datasets. We can think of machine learning data like survey data: the larger and more complete your sample data size is, the more reliable your conclusions will be. If the data sample isn't large enough, it won't be able to capture all the variations, making your machine reach inaccurate conclusions, learn patterns that don't really exist, or fail to recognize patterns that do. Datasets help bring the data to you. Datasets train the model for performing various actions. They model the algorithms to uncover relationships, detect patterns, understand complex problems as well as make decisions. Apart from using datasets, it is equally important to make sure that you are using the right dataset, one which is in a useful format and comprises all the meaningful features and variations. After all, the system will ultimately do what it learns from the data. Feeding the right data into your machines also ensures that the machine will work effectively and produce accurate results without any human interference required. For instance, training a speech recognition system with a textbook English dataset will result in your machine struggling to understand anything but textbook English, so any loose grammar, foreign accents, or speech disorders would get missed. For such a system, using a dataset comprising all the infinite variations in a spoken language among speakers of different genders, ages, and dialects would be the right option. So keep in mind that the quality, variety, and quantity of your training data must not be compromised, as all these factors help determine the success of your machine learning models.

Top Machine Learning Datasets for Beginners

Now, there are a lot of datasets available today for use in your ML applications. It can be confusing, especially for a beginner, to determine which dataset is the right one for your project. It is better to use a dataset which can be downloaded quickly and doesn't take much to adapt to the models. Further, always use standard datasets that are well understood and widely used. This lets you compare your results with others who have used the same dataset to see if you are making progress.
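As a quick illustration of how little code it takes to get started with a small, standard dataset, here is a minimal scikit-learn sketch using the bundled Iris data (the split ratio and choice of classifier are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the classic Iris dataset: 150 flowers, 4 features, 3 species.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train a simple k-nearest-neighbours classifier and evaluate it.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```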
You can pick the dataset you want to use depending on the type of your Machine Learning application. Here's a rundown of easy and commonly used datasets available for training Machine Learning applications across popular problem areas, from image processing to video analysis to text recognition to autonomous systems.

Image Processing

There are many image datasets to choose from, depending on what it is that you want your application to do. Image processing in Machine Learning is used to train the machine to process images and extract useful information from them. For instance, if you're working on a basic facial recognition application, then you can train it using a dataset that has thousands of images of human faces. This is how Facebook knows people in group pictures. This is also how image search works in Google and in other visual-search-based product sites.

10k US Adult Faces Database: This database consists of 10,168 natural face photographs and several measures for 2,222 of the faces, including memorability scores, computer vision, and psychological attributes. The face images are JPEGs with 72 pixels/in resolution and 256-pixel height.
Google's Open Images: Open Images is a dataset of 9 million URLs to images which have been annotated with labels spanning over 6,000 categories. These labels cover more real-life entities and the images are listed as having a Creative Commons Attribution license.
Visual Genome: This is a dataset of over 100k images densely annotated with numerous region descriptions (girl feeding elephant), objects (elephants), attributes (large), and relationships (feeding).
Labeled Faces in the Wild: This database comprises more than 13,000 images of faces collected from the web. Each face is labeled with the name of the person pictured.

Fun and easy ML application ideas for beginners using image datasets:

Cat vs Dogs: Use the Cat and Stanford Dogs datasets to classify whether an image contains a dog or a cat.
Iris Flower classification: You can build an ML project using the Iris flower dataset where you classify the flowers into any of the three species. What you learn from this toy project will help you learn to classify physical-attribute-based content to build some fun real-world projects like fraud detection, criminal identification, pain management (e.g. ePAT, which detects facial hints of pain using facial recognition technology), and so on.
Hot dog - Not hot dog: Use the Food 101 dataset to distinguish different food types as a hot dog or not. Who knows, you could end up becoming the next Emmy award nominee!

Sentiment Analysis

As a beginner, you can create some really fun applications using a sentiment analysis dataset. Sentiment analysis in Machine Learning applications is used to train machines to analyze and predict the emotion or sentiment associated with a sentence, word, or piece of text. This is often used for movie or product reviews. If you are creative enough, you could even identify topics that will generate the most discussions using sentiment analysis as a key tool.

Sentiment140: A popular dataset, which uses 160,000 tweets with emoticons pre-removed.
Yelp Reviews: An open dataset released by Yelp, containing more than 5 million reviews on restaurants, shopping, nightlife, food, entertainment, etc.
Twitter US Airline Sentiment: Twitter data on US airlines starting from February 2015, labeled as positive, negative, and neutral tweets.
Amazon Reviews: This dataset contains over 35 million reviews from Amazon spanning 18 years. Data includes information on products, user ratings, and the plaintext review.

Easy and fun application ideas using sentiment analysis datasets:

Positive or Negative: Use the Sentiment140 dataset in a model to classify whether given tweets are negative or positive.
Happy or unhappy: Use the Yelp Reviews dataset in your project to help the machine figure out whether the person posting the review is happy or unhappy.
Good or Bad: Use the Amazon Reviews dataset to train a machine to figure out whether a given review is good or bad.

Natural Language Processing

Natural language processing deals with training machines to process and analyze large amounts of natural language data. This is how search engines like Google know what you are looking for when you type in your search query. Use these datasets to make a basic and fun NLP application in Machine Learning:

Speech Accent Archive: This dataset comprises 2,140 speech samples from different talkers reading the same reading passage. These talkers come from 177 countries and have 214 different native languages. Each talker is speaking in English.
Wikipedia Links Data: This dataset consists of almost 1.9 billion words from more than 4 million articles. Search is possible by word, phrase or part of a paragraph itself.
Blogger Corpus: A dataset comprising 681,288 blog posts gathered from blogger.com. Each blog consists of a minimum of 200 occurrences of commonly used English words.

Fun application ideas using NLP datasets:

Spam or not: Using the Spambase dataset, you can enable your application to figure out whether a given email is spam or not.

Video Processing

Video processing datasets are used to teach machines to analyze and detect different settings, objects, emotions, or actions and interactions in videos. You'll have to feed your machine with a lot of data on different actions, objects, and activities.

UCF101 - Action Recognition Data Set: This dataset comes with 13,320 videos from 101 action categories.
YouTube 8M: YouTube-8M is a large-scale labeled video dataset. It contains millions of YouTube video IDs, with high-quality machine-generated annotations from a diverse vocabulary of 3,800+ visual entities.

Fun application ideas using video processing datasets:

Action detection: Using the UCF101 - Action Recognition Data Set or YouTube 8M, you can train your application to detect actions such as walking, running, etc. in a video.

Speech Recognition

Speech recognition is the ability of a machine to analyze or identify words and phrases in a spoken language. Feed your machine the right and a good amount of data, and it will help it in the process of recognizing speech. Combine speech recognition with natural language processing and you get Alexa, who knows what you need.

Gender Recognition by Voice and Speech Analysis: This database identifies a voice as male or female, depending on the acoustic properties of voice and speech. The dataset contains 3,168 recorded voice samples, collected from male and female speakers.
Human Activity Recognition w/Smartphone: The Human Activity Recognition database consists of recordings of 30 subjects performing activities of daily living (ADL) while carrying a smartphone (Samsung Galaxy S2) on the waist.
TIMIT: TIMIT provides speech data for acoustic-phonetic studies and for the development of automatic speech recognition systems.
It comprises broadband recordings of 630 speakers of eight major dialects of American English, each reading ten phonetically rich sentences, along with phonetic and word transcriptions.
Speech Accent Archive: This dataset contains 2,140 speech samples, each from a different talker reading the same reading passage. Talkers come from 177 countries and have 214 different native languages. Each talker is speaking in English.

Fun application ideas using speech recognition datasets:

Accent detection: Use the Speech Accent Archive dataset to make your application identify different accents from a given sample of accents.
Identify the activity: Use the Human Activity Recognition w/Smartphone dataset to help your application detect human activity.

Natural Language Generation

Natural language generation refers to the ability of machines to simulate human speech. It can be used to translate written information into aural information or assist the vision-impaired by reading out aloud the contents of a display screen. This is how Alexa or Siri respond to you.

Common Voice by Mozilla: The Common Voice dataset contains speech data read by users on the Common Voice website from a number of public sources like user-submitted blog posts, old books, movies, etc.
LibriSpeech: This dataset consists of nearly 500 hours of clean speech from various audiobooks read by multiple speakers, organized by chapters of the book with both the text and the speech.

Fun application ideas using natural language generation datasets:

Converting text into audio: Using the Blogger Corpus dataset, you can train your application to read out loud the posts on Blogger.

Autonomous Driving

Build some basic self-driving Machine Learning applications. These self-driving datasets will help you train your machine to sense its environment and navigate accordingly without any human interference. Autonomous cars, drones, warehouse robots, and others use these algorithms to navigate correctly and safely in the real world. Datasets are even more important here as the stakes are higher and the cost of a mistake could be a human life.

Berkeley DeepDrive BDD100k: This is one of the largest datasets for self-driving AI currently. It comprises over 100,000 videos of over 1,100 hours of driving experiences across different times of the day and weather conditions.
Baidu Apolloscapes: A large dataset consisting of 26 different semantic items such as cars, bicycles, pedestrians, buildings, street lights, etc.
Comma.ai: This dataset consists of more than 7 hours of highway driving. It includes details on the car's speed, acceleration, steering angle, and GPS coordinates.
Cityscapes Dataset: This is a large dataset that contains recordings of urban street scenes in 50 different cities.
nuScenes: This dataset consists of more than 1,000 scenes with around 1.4 million images, 400,000 lidar sweeps (laser-based systems that detect the distance between objects), and 1.1 million 3D bounding boxes (objects detected with a combination of RGB cameras, radar, and lidar).

Fun application ideas using autonomous driving datasets:

A basic self-driving application: Use any of the self-driving datasets mentioned above to train your application with different driving experiences for different times and weather conditions.

IoT

Machine Learning in building IoT applications is on the rise these days.
Now, as a beginner in Machine Learning, you may not have advanced knowledge of how to build these high-performance IoT applications using Machine Learning, but you certainly can start off with some basic datasets to explore this exciting space.

Wayfinding, Path Planning, and Navigation Dataset: This dataset consists of samples of trajectories in an indoor building (Waldo Library at Western Michigan University) for navigation and wayfinding applications.
ARAS Human Activity Dataset: This is a human activity recognition dataset collected from two real houses. It involves over 26 million sensor readings and over 3,000 activity occurrences.

Fun application ideas using IoT datasets:

Wearable device to track human activity: Use the ARAS Human Activity Dataset to train a wearable device to identify human activity.

Read Also: 25 Datasets for Deep Learning in IoT

Once you're done going through this list, it's important not to feel restricted. These are not the only datasets which you can use in your Machine Learning applications. You can find many more online which might work best for the type of Machine Learning project that you're working on. Some popular sources of a wide range of datasets are Kaggle, the UCI Machine Learning Repository, KDnuggets, Awesome Public Datasets, and the Reddit Datasets Subreddit. With all this information, it is now time to use these datasets in your project. In case you're completely new to Machine Learning, you will find reading 'A nonprogrammer's guide to learning Machine learning' quite helpful. Regardless of whether you're a beginner or not, always remember to pick a dataset which is widely used and can be downloaded quickly from a reliable source.

How to create and prepare your first dataset in Salesforce Einstein
Google launches a Dataset Search Engine for finding Datasets on the Internet
Why learn machine learning as a non-techie?


The evolution of cybercrime

Packt Editorial Staff
29 Mar 2018
4 min read
A history of cybercrime

As computer systems have now become integral to the daily functioning of businesses, organizations, governments, and individuals, we have learned to put a tremendous amount of trust in these systems. As a result, we have placed incredibly important and valuable information on them. History has shown that things of value will always be a target for a criminal. Cybercrime is no different. As people flood their personal computers, phones, and so on with valuable data, they put a target on that information for the criminal to aim for, in order to gain some form of profit from the activity. In the past, in order for a criminal to gain access to an individual's valuables, they would have to conduct a robbery in some shape or form. In the case of data theft, the criminal would need to break into a building and sift through files looking for the information of greatest value and profit. In our modern world, the criminal can attack their victims from a distance, and due to the nature of the internet, these acts would most likely never meet retribution.

Cybercrime in the 70s and 80s

In the 70s, we saw criminals taking advantage of the tone system used on phone networks. The attack was called phreaking, where the attacker reverse-engineered the tones used by the telephone companies to make long distance calls. In 1988, the first computer worm made its debut on the internet and caused a great deal of destruction to organizations. This first worm was called the Morris worm, after its creator Robert Morris. While this worm was not originally intended to be malicious, it still caused a great deal of damage. The U.S. Government Accountability Office estimated that the damage could have been as high as $10,000,000.00. 1989 brought us the first known ransomware attack, which targeted the healthcare industry. Ransomware is a type of malicious software that locks a user's data until a small ransom is paid, which results in the issuance of a cryptographic unlock key. In this attack, an evolutionary biologist named Joseph Popp distributed 20,000 floppy disks across 90 countries, and claimed the disks contained software that could be used to analyze an individual's risk factors for contracting the AIDS virus. The disks, however, contained a malware program that, when executed, displayed a message requiring the user to pay for a software license. Ransomware attacks have evolved greatly over the years, with the healthcare field still being a very large target.

The birth of the web and a new dawn for cybercrime

The 90s brought the web browser and email to the masses, which meant new tools for cybercriminals to exploit. This allowed the cybercriminal to greatly expand their reach. Up until this time, the cybercriminal needed to initiate a physical transaction, such as providing a floppy disk. Now cybercriminals could transmit virus code over the internet in these new, highly vulnerable web browsers. Cybercriminals took what they had learned previously and modified it to operate over the internet, with devastating results. Cybercriminals were also able to reach out and con people from a distance with phishing attacks. No longer was it necessary to engage with individuals directly; you could attempt to trick millions of users simultaneously. Even if only a small percentage of people took the bait, you stood to make a lot of money as a cybercriminal. The 2000s brought us social media and saw the rise of identity theft.
A bullseye was painted for cybercriminals with the creation of databases containing millions of users' personally identifiable information (PII), making identity theft the new financial piggy bank for criminal organizations around the world. This information, coupled with a lack of cybersecurity awareness from the general public, allowed cybercriminals to commit all types of financial fraud such as opening bank accounts and credit cards in the name of others.

Cybercrime in a fast-paced technology landscape

Today we see that cybercriminal activity has only gotten worse. As computer systems have gotten faster and more complex, we see that the cybercriminal has become more sophisticated and harder to catch. Today we have botnets, which are networks of private computers that are infected with malicious software and allow the criminal element to control millions of infected computer systems across the globe. These botnets allow the criminal element to overload organizational networks and hide the origin of the criminals:

We see constant ransomware attacks across all sectors of the economy
People are constantly on the lookout for identity theft and financial fraud
There are continuous news reports regarding the latest point of sale attacks against major retailers and hospitality organizations

This is an extract from Information Security Handbook by Darren Death. Follow Darren on Twitter: @DarrenDeath.


Iterative Machine Learning: A step towards Model Accuracy

Amarabha Banerjee
01 Dec 2017
10 min read
Learning something by rote, i.e., repeating it many times, perfecting a skill by practising it over and over again, or building something by making minor adjustments progressively to a prototype are things that come to us naturally as human beings. Machines can also learn this way, and this is called 'iterative machine learning'. In most cases, iteration is an efficient learning approach that helps reach the desired end results faster and more accurately without becoming a resource crunch nightmare. Now, you might wonder, isn't iteration inherently part of any kind of machine learning? In other words, modern day machine learning techniques across the spectrum, from basic regression analysis, decision trees, and Bayesian networks to advanced neural nets and deep learning algorithms, have some inherent iterative component built into them. What is the need, then, for discussing iterative learning as a standalone topic? This is simply because introducing iteration externally to an algorithm can minimize the error margin and therefore help in accurate modelling.

How Iterative Learning works

Let's understand how iteration works by looking closely at what happens during a single iteration flow within a machine learning algorithm. A pre-processed training dataset is first introduced into the model. After processing and model building with the given data, the model is tested, and then the results are matched with the desired result/expected output. The feedback is then returned to the system for the algorithm to further learn and fine tune its results. This clearly shows that two iteration processes take place here:

Data Iteration - Inherent to the algorithm
Model Training Iteration - Introduced externally

Now, what if we did not feed the results back into the system, i.e. did not allow the algorithm to learn iteratively, but instead adopted a sequential approach? Would the algorithm work and would it provide the right results? Yes, the algorithm would definitely work. However, the quality of the results it produces is going to vary vastly based on a number of factors: the quality and quantity of the training dataset, the feature definition and extraction techniques employed, and the robustness of the algorithm itself, among many others. Even if all of the above were done perfectly, there is still no guarantee that the results produced by a sequential approach will be highly accurate. In short, the results will neither be accurate nor reproducible. Iterative learning thus allows algorithms to improve model accuracy. Certain algorithms have iteration central to their design and can be scaled as per the data size. These algorithms are at the forefront of machine learning implementations because of their ability to perform faster and better. In the following sections we will discuss iteration in different sets of algorithms, each from the three main machine learning approaches - supervised ML, unsupervised ML and reinforcement learning.

The Boosting algorithms: Iteration in supervised ML

The boosting algorithms, inherently iterative in nature, are a brilliant way to improve results by minimizing errors. They are primarily designed to reduce bias in results and transform a particular set of weak learning classifier algorithms into strong learners, enabling them to reduce errors. Some examples are:

AdaBoost (Adaptive Boosting)
Gradient Tree Boosting
XGBoost

How they work

All boosting algorithms build on a common set of classifiers which are iteratively modified to reach the desired result, as the short sketch and the worked example below illustrate.
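Here is a minimal sketch of boosting in practice, using scikit-learn's AdaBoostClassifier on a synthetic toy dataset (the dataset, parameters, and library choice are illustrative, not taken from the original article):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Toy binary classification problem.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting round adds another weak learner (a shallow decision tree
# by default) and re-weights the samples the previous round got wrong.
model = AdaBoostClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```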
Let's take the example of finding cases of plagiarism in a certain article. The first classifier here would be to find a group of words that appears somewhere else or in another article, which would result in a red flag. If we create 10 separate groups of words and term them classifiers 1 to 10, then our article will be checked on the basis of these classifiers and any possible matches will be red-flagged. But the absence of red flags from these 10 classifiers would not mean the article is definitely 100% original. Thus, we would need to update the classifiers, create shorter groups perhaps based on the first pass, and improve the accuracy with which the classifiers can find similarity with other articles. This iteration process in boosting algorithms eventually leads us to a fairly high rate of accuracy, the reason being that after each iteration the classifiers are updated based on their performance. The ones which have close similarity with other content are updated and tweaked so that we can get a better match. This process of improving the algorithm inherently is termed boosting and is currently one of the most popular methods in supervised machine learning.

Strengths & weaknesses

The obvious advantage of this approach is that it allows minimal errors in the final model, as the iteration enables the model to correct itself every time there is an error. The downside is the higher processing time and the overall memory requirement for a large number of iterations. Another important aspect is that the error fed back to train the model is applied externally, which means the supervisor has control over the model and how it is modified. This in turn has a downside: the model doesn't learn to eliminate error on its own. Hence, the model is not reusable with another set of data. In other words, the model does not learn how to become error-free by itself and hence cannot be ported to another dataset, as it would need to start the learning process from scratch.

Artificial Neural Networks: Iteration in unsupervised ML

Neural networks have become the poster child for unsupervised machine learning because of their accuracy in predicting data models. Some well known neural networks are:

Convolutional Neural Networks
Boltzmann Machines
Recurrent Neural Networks
Deep Neural Networks
Memory Networks

How they work

Artificial neural networks are highly accurate in simulating data models mainly because of their iterative process of learning. But this process is different from the one we explored earlier for boosting algorithms. Here the process is seamless and natural, and in a way it paves the way for reinforcement learning in AI systems. Neural networks consist of electronic networks simulating the way the human brain works. Every network has an input and output node, and in-between hidden layers that consist of algorithms. The input node is given the initial dataset to perform a set of actions, and each iteration creates a result that is output as a string of data. This output is then matched with the actual result dataset and the error is fed back to the input node. This error then enables the algorithms to correct themselves and get closer and closer to the actual dataset. This process is called training the neural network, and each iteration improves the accuracy. The key difference between the iteration performed here as compared to how it is performed by boosting algorithms is that here we don't have to update the classifiers manually; the algorithms change themselves based on the error feedback.
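To make the error-feedback loop above concrete, here is a minimal sketch in plain NumPy: a one-layer linear model is fitted to toy data, with the error from each iteration pushed back into the weights. The data, learning rate, and iteration count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + 2 plus a little noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = 3 * X[:, 0] + 2 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0          # model parameters
lr = 0.1                 # learning rate

for step in range(500):
    pred = w * X[:, 0] + b
    error = pred - y                     # feedback: how wrong were we?
    w -= lr * (error * X[:, 0]).mean()   # gradient step for the weight
    b -= lr * error.mean()               # gradient step for the bias

print(f"learned w={w:.2f}, b={b:.2f}")   # should approach w=3, b=2
```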
Strengths & weaknesses

The main advantage of this process is obviously the level of accuracy that it can achieve on its own. The model is also reusable, because it learns the means to achieve accuracy and not just a direct result. The flip side of this approach is that the models can go wrong heavily and deviate completely in a different direction. This is because the induced iteration takes its own course and doesn't need human supervision. The Facebook chatbots deviating from their original goal and communicating among themselves in a language of their own is a case in point. But as the saying goes, smart things come with their own baggage. It's a risk we have to be ready to tackle if we want to create more accurate models and smarter systems.

Reinforcement Learning

Reinforcement learning is an interesting case of machine learning where simple neural networks are connected and together they interact with the environment to learn from their mistakes and rewards. The iteration introduced here happens in a complex way: it takes the form of reward or punishment for arriving at the correct or wrong results respectively. After each interaction of this kind, the multilayered neural networks incorporate the feedback and then recreate the models for better accuracy. This reward-and-punishment method puts it in a space where it is neither supervised nor unsupervised, but exhibits traits of both, and it also has the added advantage of producing more accurate results. The con here is that the models are complex by design. Multilayered neural networks are difficult to handle in the case of multiple iterations because each layer might respond differently to a certain reward or punishment. As such it may create inner conflict that might lead to a stalled system - one that can't decide which direction to move next.

Some Practical Implementations of Iteration

Many modern day machine learning platforms and frameworks have implemented the iteration process on their own to create better data models; Apache Spark and MapReduce are two such examples. The way the two implement iteration is technically different, and they have their merits and limitations. Let's look at MapReduce. It reads and writes data directly onto the HDFS filesystem present on the disk. Note that reading from and writing to the disk on every iteration takes significant time. This in a way creates a more robust and fault tolerant system but compromises on speed. On the other hand, Apache Spark stores the data in memory (as a Resilient Distributed Dataset), i.e. in RAM. As a result, each iteration takes much less time, which enables Spark to perform lightning fast data processing. But the primary problem with the Spark way of doing iteration is that dynamic memory (RAM) is much less reliable than disk storage for holding iteration data and performing complex operations. Hence it is much less fault tolerant than MapReduce.

Bringing it together

To sum up the discussion, we can look at the process of iteration and its stages in implementing machine learning models roughly as follows:

Parameter Iteration: This is the first and inherent stage of iteration for any algorithm. The parameters involved in a certain algorithm are run multiple times and the best fitting parameters for the model are finalized in this process.
Data Iteration: Once the model parameters are finalized, the data is put into the system and the model is simulated.
Multiple sets of data are put into the system to check the parameters' effectiveness in bringing out the desired result. Hence, if the data iteration stage suggests that some of the parameters are not well suited for the model, then they are taken back to the parameter iteration stage and parameters are added or modified.
Model Iteration: After the initial parameters and datasets are finalized, the model testing/training happens. The iteration in the model testing phase is all about running the same model simulation multiple times with the same parameters and dataset, and then checking the amount of error. If the error varies significantly in every iteration, then there is something wrong with either the data or the parameters or both. Iterations are done on the data and parameters until the model achieves accuracy.
Human Iteration: This step involves human-induced iteration, where different models are put together to create a fully functional smart system. Here, multiple levels of fitting and refitting happen to achieve a coherent overall goal, such as creating a driverless car system or a fully functional AI.

Iteration is pivotal to creating smarter AI systems in the near future. The enormous memory requirements for performing multiple iterations on complex datasets continue to pose major challenges. But with increasingly better AI chips, storage options and data transfer techniques, these challenges are getting easier to handle. We believe iterative machine learning techniques will continue to lead the transformation of the AI landscape in the near future.
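As a small illustration of the in-memory iteration point made above about Spark, here is a PySpark sketch that caches a dataset once and then runs several passes over it without re-reading from disk. This is a sketch assuming a local Spark installation; the computation itself is a toy:

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "iterative-demo")

# Cache the dataset in memory so each iteration reuses it
# instead of being re-read from storage on every pass.
data = sc.parallelize(range(1_000_000)).cache()

result = 0.0
for i in range(10):                      # ten iterations over the same RDD
    result = data.map(lambda x: (x + i) % 7).mean()

print("final value:", result)
sc.stop()
```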


Types of Cloud Computing Services: IaaS, PaaS, and SaaS

Amey Varangaonkar
07 Aug 2018
4 min read
Cloud computing has risen massively in terms of popularity in recent times. This is due to the way it reduces on-premise infrastructure cost and improves efficiency. Primarily, the cloud model has been divided into three major service categories:

Infrastructure as a Service (IaaS)
Platform as a Service (PaaS)
Software as a Service (SaaS)

We will discuss each of these in the following sections. This article is an excerpt taken from the book 'Cloud Analytics with Google Cloud Platform', written by Sanket Thodge.

Infrastructure as a Service (IaaS)

Infrastructure as a Service provides infrastructure such as servers, virtual machines, networks, operating systems, storage, and much more on a pay-as-you-use basis. IaaS providers offer VMs ranging from small to extra-large machines, and IaaS gives you complete freedom in choosing the instance type as per your requirements. Common cloud vendors providing IaaS services are:

Google Cloud Platform
Amazon Web Services
IBM
HP Public Cloud

Platform as a Service (PaaS)

The PaaS model is similar to IaaS, but it also provides additional tools such as database management systems, business intelligence services, and so on. Cloud platforms providing PaaS services are as follows:

Windows Azure
Google App Engine
Cloud Foundry
Amazon Web Services

Software as a Service (SaaS)

Software as a Service (SaaS) lets users connect to products through the internet (or sometimes also helps them build an in-house private cloud solution) on a subscription basis. Some cloud vendors providing SaaS are:

Google Apps
Salesforce
Zoho
Microsoft Office 365

Differences between SaaS, PaaS, and IaaS

The major differences between these models can be summarized as follows:

Software as a Service (SaaS): A model in which a third-party provider hosts multiple applications and lets customers use them over the internet. SaaS is a very useful pay-as-you-use model. Examples: Salesforce, NetSuite.
Platform as a Service (PaaS): A model in which a third-party provider offers an application development platform and services built on its own infrastructure. Again, these tools are made available to customers over the internet. Examples: Google App Engine, AWS Lambda.
Infrastructure as a Service (IaaS): A model in which a third-party provider offers servers, storage, compute resources, and so on, and makes them available for customers to utilize. Customers can use IaaS to build their own PaaS and SaaS services for their customers. Examples: Google Cloud Compute, Amazon S3.

How PaaS, IaaS, and SaaS are separated at a service level

In this section, we are going to learn how we can separate IaaS, PaaS, and SaaS at the service level. The baseline is OPS, which stands for operations - the bare minimum requirements for any typical server. When we are buying a server, we should consider these layers: Application, Data, Runtime, Framework, Operating System, Server, Disk, and Network Stack. When we move to the cloud and decide to go with IaaS, we are no longer bothered about the server, disk, and network stack; the headache of handling the hardware part is no longer ours. That's why it is called Infrastructure as a Service.
Now if we think of PaaS, we also do not need to worry about the runtime, framework, and operating system, on top of the components covered by IaaS. The things that we need to focus on are only the application and data. And the last deployment model is SaaS - Software as a Service. In this model, we are not concerned about literally anything. The only thing that we need to work on is the code, and then just take a look at the bill. It's that simple! If you found the above excerpt useful, make sure to check out the book 'Cloud Analytics with Google Cloud Platform' for more of such interesting insights into Google Cloud Platform.

Read more:
Top 5 cloud security threats to look out for in 2018
Is cloud mining profitable?
Why AWS is the preferred cloud platform for developers working with big data?

Why TensorFlow always tops machine learning and artificial intelligence tool surveys

Sunith Shetty
23 Aug 2018
9 min read
TensorFlow is an open source machine learning framework for carrying out high-performance numerical computations. It provides excellent architecture support which allows easy deployment of computations across a variety of platforms ranging from desktops to clusters of servers, mobiles, and edge devices. Have you ever wondered why TensorFlow has become so popular in such a short span of time? What made TensorFlow so special that we are seeing a huge surge of developers and researchers opting for the TensorFlow framework? Interestingly, when it comes to an artificial intelligence framework showdown, you will find TensorFlow emerging as a clear winner most of the time. The major credit goes to its soaring popularity and contributions across various forums such as GitHub, Stack Overflow, and Quora. The fact is, TensorFlow is being used in over 6,000 open source repositories, showing its roots in many real-world research projects and applications.

How TensorFlow came to be
The library was developed by a group of researchers and engineers from the Google Brain team within Google's AI organization. They wanted a library that provides strong support for machine learning, deep learning, and advanced numerical computations across different scientific domains. Since Google open sourced its machine learning framework in 2015, TensorFlow has grown in popularity, with more than 1,500 project mentions on GitHub. The constant updates made to the TensorFlow ecosystem are the real cherry on the cake. They have ensured that the new challenges developers and researchers face are addressed, easing complex computations and providing newer features, promises, and performance improvements with the support of high-level APIs. By open sourcing the library, the Google research team has received all the benefits of a huge set of contributors outside its existing core team. The idea was to make TensorFlow popular by open sourcing it, making sure all new research ideas are implemented in TensorFlow first and allowing Google to productize those ideas.
Read Also: 6 reasons why Google open sourced TensorFlow

What makes TensorFlow different from the rest?
With more and more research and real-life use cases going mainstream, we can see a big trend of programmers and developers flocking towards the tool called TensorFlow. The popularity of TensorFlow is quite evident, with big names adopting it for carrying out artificial intelligence tasks. Many popular companies such as NVIDIA, Twitter, Snapchat, Uber and more are using TensorFlow for all their major operations and research areas. On one hand, someone can make a case that TensorFlow's popularity is based on its origins/legacy. TensorFlow, being developed under the house of "Google", enjoys the reputation of a household name. There's no doubt TensorFlow has been better marketed than some of its competitors. (Image source: The Data Incubator) However, that's not the full story. There are many other compelling reasons why small-scale to large-scale companies prefer using TensorFlow over other machine learning tools.

TensorFlow key functionalities
TensorFlow provides an accessible and readable syntax, which is essential for making these programming resources easier to use. Complex syntax is the last thing developers need, given machine learning's already advanced nature. TensorFlow provides excellent functionalities and services when compared to other popular deep learning frameworks. These high-level operations are essential for carrying out complex parallel computations and for building advanced neural network models. At the same time, TensorFlow is a low-level library which provides more flexibility: you can define your own functionalities or services for your models. This is a very important parameter for researchers because it allows them to change a model based on changing user requirements. TensorFlow also provides more network control, allowing developers and researchers to understand how operations are implemented across the network and to keep track of changes made over time. (A short code sketch at the end of this article shows the high-level API in practice.)

Distributed training
The trend of distributed deep learning began in 2017, when Facebook released a paper showing a set of methods to reduce the training time of a convolutional neural network model. The test was done on a ResNet-50 model on the ImageNet dataset, which took one hour to train instead of two weeks, using 256 GPUs spread over 32 servers. This revolutionary test opened the gates for much research work that has massively reduced experimentation time by running many tasks in parallel on multiple GPUs. Google's distributed TensorFlow has allowed researchers and developers to scale out complex distributed training using in-built methods and operations that optimize distributed deep learning across servers. Google's distributed TensorFlow engine, which is part of the regular TensorFlow repo, works exceptionally well with TensorFlow's existing operations and functionalities. It has allowed exploring two of the most important distributed methods:

Distributing the training of a neural network model over many servers to reduce the training time.
Searching for good hyperparameters by running parallel experiments over multiple servers.

Google has given the distributed TensorFlow engine the required power to steal the share of the market acquired by other distributed projects such as Microsoft's CNTK, AMPLab's SparkNet, and CaffeOnSpark. Even though the competition is tough, Google has still managed to become more popular than the other alternatives in the market.

From research to production
Google has, in some ways, democratized deep learning. The key reason is TensorFlow's high-level APIs, which make deep learning accessible to everyone. TensorFlow provides pre-built functions and advanced operations to ease the task of building different neural network models. It provides the required infrastructure and hardware support, which makes it one of the leading libraries used extensively by researchers and students in the deep learning domain. In addition to research tools, TensorFlow extends its services by bringing models into production using TensorFlow Serving. It is specifically designed for production environments and provides a flexible, high-performance serving system for machine learning models. It provides all the functionalities and operations which make it easy to deploy new algorithms and experiments as requirements and preferences change, and it offers out-of-the-box integration with TensorFlow models which can be easily extended to serve other types of models and data. TensorFlow's API is a complete package which is easy to use and read, plus it provides helpful operators, debugging and monitoring tools, and deployment features. This has led to the growing use of the TensorFlow library as a complete package within the ecosystem by an emerging body of students, researchers, developers, and production engineers from various fields who are gravitating towards artificial intelligence.

There is a TensorFlow for web, mobile, edge, embedded and more
TensorFlow provides a range of services and modules within its existing ecosystem, making it one of the ground-breaking end-to-end tools for state-of-the-art deep learning.

TensorFlow.js for machine learning on the web: a JavaScript library for training and deploying machine learning models in the browser. It provides flexible and intuitive APIs to build and train new and pre-existing models from scratch right in the browser or under Node.js.
TensorFlow Lite for mobile and embedded ML: a TensorFlow lightweight solution used for mobile and embedded devices. It is fast since it enables on-device machine learning inference with low latency, and it supports hardware acceleration with the Android Neural Networks API. Future releases of TensorFlow Lite will bring more built-in operators and performance improvements, and will support more models to simplify the developer's experience of bringing machine learning services to mobile devices.
TensorFlow Hub for reusable machine learning: a library which is used extensively to reuse machine learning models, so you can do transfer learning by reusing parts of existing models.
TensorBoard for visual debugging: while training a complex neural network model, the computations you use in TensorFlow can be very confusing. TensorBoard makes it very easy to understand and debug your TensorFlow programs in the form of visualizations, and it allows you to easily inspect and understand your TensorFlow runs and graphs.
Sonnet: a DeepMind library which is built on top of TensorFlow and used extensively to build complex neural network models.

All of these factors have made the TensorFlow library immensely appealing for building a wide spectrum of machine learning and deep learning projects. The tool has become a preferred choice for everyone from space research giant NASA and other confidential government agencies to an impressive roster of private sector giants.

Road Ahead for TensorFlow
TensorFlow is no doubt better marketed than the other deep learning frameworks, and the community appears to be moving very fast. In any given hour, there are approximately 10 people around the world contributing to or improving the TensorFlow project on GitHub. TensorFlow dominates the field with the largest active community. It will be interesting to see what new advances TensorFlow and other utilities make possible for the future of our digital world. Continuing the recent trend of rapid updates, the TensorFlow team is making sure they address all the current and active challenges faced by contributors and developers building machine learning and deep learning models. TensorFlow 2.0 will be a major update; we can expect the release candidate by early March next year, with a preview version of this major milestone expected to land later this year. The major focus will be on ease of use and additional support for more platforms and languages, and eager execution will be the central feature of TensorFlow 2.0. This breakthrough version will add more functionalities and operations to handle current research areas such as reinforcement learning, GANs, and building advanced neural network models more efficiently.
Google will continue to invest and upgrade their existing TensorFlow ecosystem. According to Google’s CEO, Sundar Pichai “artificial intelligence is more important than electricity or fire.” TensorFlow is the solution they have come up with to bring artificial intelligence into reality and provide a stepping stone to revolutionize humankind. Read more The 5 biggest announcements from TensorFlow Developer Summit 2018 The Deep Learning Framework Showdown: TensorFlow vs CNTK Tensor Processing Unit (TPU) 3.0: Google’s answer to cloud-ready Artificial Intelligence
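To make the high-level API and TensorBoard points above concrete, here is a minimal sketch (not from the original article) that trains a tiny tf.keras classifier on random data and logs the run for TensorBoard; the layer sizes, epochs, and log directory are arbitrary illustrative choices.

```python
# A minimal sketch of TensorFlow's high-level Keras API with TensorBoard logging.
# Layer sizes, epochs, and the log directory are illustrative choices only.
import numpy as np
import tensorflow as tf

# Toy data: 1,000 samples with 20 features, 3 classes.
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 3, size=(1000,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# The TensorBoard callback writes logs that the dashboard can visualize.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/demo")

model.fit(x_train, y_train, epochs=5, batch_size=32,
          validation_split=0.2, callbacks=[tensorboard_cb])
```

Running `tensorboard --logdir logs` afterwards opens the visual debugging dashboard discussed above.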
NVIDIA leads the AI hardware race. But which of its GPUs should you use for deep learning?

Prasad Ramesh
29 Aug 2018
8 min read
For readers who are new to deep learning and who might be wondering what a GPU is, let's start there. To make it simple, consider deep learning as nothing more than a set of calculations - complex calculations, yes, but calculations nonetheless. To run these calculations, you need hardware. Ordinarily, you might just use a normal processor like the CPU inside your laptop. However, this isn't powerful enough to process at the speed at which deep learning computations need to happen. GPUs, however, can. This is because while a conventional CPU has only a few complex cores, a GPU can have thousands of simple cores. With a GPU, training on a deep learning data set can take just hours instead of days. However, although it's clear that GPUs have significant advantages over CPUs, there is nevertheless a range of GPUs available, each having their own individual differences. Selecting one is ultimately a matter of knowing what your needs are. Let's dig deeper and find out how to go about shopping for GPUs…

What to look for before choosing a GPU?
There are a few specifications to consider before picking a GPU (a short script for checking these specs on a machine you already have appears at the end of this article).

Memory bandwidth: This determines the capacity of a GPU to handle large amounts of data. It is the most important performance metric, as with faster memory bandwidth more data can be processed at higher speeds.
Number of cores: This indicates how fast a GPU can process data. A large number of CUDA cores can handle large datasets well. CUDA cores are parallel processors similar to CPU cores, but they number in the thousands and are not suited to the complex calculations that a CPU core can perform.
Memory size: For computer vision projects, it is crucial for memory size to be as much as you can afford. With natural language processing, memory size does not play such an important role.

Our pick of GPU devices to choose from
The go-to choice here is NVIDIA; they have standard libraries that make it simple to set things up. Other graphics cards are not very friendly in terms of the libraries supported for deep learning, and the NVIDIA CUDA Deep Neural Network library also has a good development community.

"Is NVIDIA Unstoppable In AI?" -Forbes
"Nvidia beats forecasts as sales of graphics chips for AI keep booming" -SiliconANGLE

AMD GPUs are powerful too but lack library support to get things running smoothly. It would be really nice to see some AMD libraries being developed to break the monopoly and give more options to consumers.

NVIDIA RTX 2080 Ti: The RTX line of GPUs is to be released in September 2018. The RTX 2080 Ti will be twice as fast as the 1080 Ti. The price listed on the NVIDIA website for the founder's edition is $1,199.
VRAM: 11 GB, memory bandwidth: 616 GB/s, cores: 4352 @ 1545 MHz

NVIDIA RTX 2080: This is more cost efficient than the 2080 Ti, at a listed price of $799 on the NVIDIA website for the founder's edition.
VRAM: 8 GB, memory bandwidth: 448 GB/s, cores: 2944 @ 1710 MHz

NVIDIA RTX 2070: This is more cost efficient than the 2080 Ti, at a listed price of $599 on the NVIDIA website. Note that the other versions of the RTX cards will likely be cheaper than the founder's edition, with around a $100 difference.
VRAM: 8 GB, memory bandwidth: 448 GB/s, cores: 2304 @ 1620 MHz

NVIDIA GTX 1080 Ti: Priced at $650 on Amazon. This is a higher-end option but offers great value for money, and can also do well in Kaggle competitions. If you need more memory but cannot afford the RTX 2080 Ti, go for this.
VRAM: 11 GB, memory bandwidth: 484 GB/s, cores: 3584 @ 1582 MHz

NVIDIA GTX 1080: Priced at $584 on Amazon. This is a mid-high end option, only slightly behind the 1080 Ti.
VRAM: 8 GB, memory bandwidth: 320 GB/s, cores: 2560 @ 1733 MHz

NVIDIA GTX 1070 Ti: Priced at around $450 on Amazon. This is slightly less performant than the GTX 1080 but $100 cheaper.
VRAM: 8 GB, memory bandwidth: 256 GB/s, cores: 2438 @ 1683 MHz

NVIDIA GTX 1070: Priced at $380 on Amazon, it is currently the bestseller because of crypto miners. Somewhat slower than the 1080 GPUs but cheaper.
VRAM: 8 GB, memory bandwidth: 256 GB/s, cores: 1920 @ 1683 MHz

NVIDIA GTX 1060 6GB: Priced at around $290 on Amazon. Pretty cheap, but the 6 GB of VRAM limits you. It should be good for NLP, but you'll find the performance lacking in computer vision.
VRAM: 6 GB, memory bandwidth: 216 GB/s, cores: 1280 @ 1708 MHz

NVIDIA GTX 1050 Ti: Priced at around $200 on Amazon. This is the cheapest workable option - good to get started with deep learning and explore if you're new.
VRAM: 4 GB, memory bandwidth: 112 GB/s, cores: 768 @ 1392 MHz

NVIDIA Titan XP: The Titan XP is also an option, but it gives only marginally better performance while being almost twice as expensive as the GTX 1080 Ti. It has 12 GB of memory, 547.7 GB/s bandwidth and 3840 cores @ 1582 MHz.

On a side note, NVIDIA Quadro GPUs are pretty expensive and don't really help in deep learning; they are more of use in CAD and heavy graphics production tasks. The graph below does a pretty good job of visualizing how all the GPUs above compare. (Chart source: Slav Ivanov Blog; processing power is calculated as CUDA cores times the clock frequency.)

Does the number of GPUs matter?
Yes, it does. But how many do you really need? What's going to suit the scale of your project without breaking your budget? Two GPUs will always yield better results than just one - but it's only really worth it if you need the extra power. There are two options you can take with multi-GPU deep learning. On the one hand, you can train several different models at once across your GPUs, or, alternatively, distribute one single training model across multiple GPUs, known as "multi-GPU training". The latter approach is compatible with TensorFlow, CNTK, and PyTorch. Both of these approaches have advantages. Ultimately, it depends on how many projects you're working on and, again, what your needs are. Another important point to bear in mind is that if you're using multiple GPUs, the processor and hard disk need to be fast enough to feed data continuously - otherwise the multi-GPU approach is pointless. (Image source: NVIDIA website.) It boils down to your needs and budget; GPUs aren't exactly cheap.

Other heavy devices
There are also other large machines apart from GPUs. These include the specialized supercomputer from NVIDIA, the DGX-2, and Tensor Processing Units (TPUs) from Google.

The NVIDIA DGX-2
If you thought GPUs were expensive, let me introduce you to the NVIDIA DGX-2, the successor to the NVIDIA DGX-1. It's a highly specialized workstation; consider it a supercomputer that has been specially designed to tackle deep learning. The price of the DGX-2 is (*gasp*) $399,000. Wait, what? I could buy some new hot wheels for that, or dual Intel Xeon Platinum 8168s (2.7 GHz, 24 cores), 16 NVIDIA GPUs, 1.5 terabytes of RAM, and nearly 32 terabytes of SSD storage! The performance here is 2 petaFLOPS.
Let’s be real: many of us probably won’t be able to afford it. However, NVIDIA does have leasing options, should you choose to try it. Practically speaking, this kind of beast finds its use in research work. In fact, the first DGX-1 was gifted to OpenAI by NVIDIA to promote AI research. Visit the NVIDIA website for more on these monster machines. There are also personal solutions available like the NVIDIA DGX Workstation. TPUs Now that you’ve caught your breath after reading about AI dream machines, let’s look at TPUs. Unlike the DGX machines, TPUs run on the cloud. A TPU is what’s referred to as an application-specific integrated circuit (ASIC) that has been designed specifically for machine learning and deep learning by Google. Here’s the key stats: Cloud TPUs can provide up to 11.5 petaflops of performance in a single pod. If you want to learn more, go to Google’s website. When choosing GPUs you need to weigh up your options The GTX 1080 Ti is most commonly used by researchers and competitively for Kaggle, as it gives good value for money. Go for this if you are sure about what you want to do with deep learning. The GTX 1080 and GTX 1070 Ti are cheaper with less computing power, a more budget friendly option if you cannot afford the 1080 Ti. GTX 1070 saves you some more money but is slower. The GTX 1060 6GB and GTX 1050 Ti are good if you’re just starting off in the world of deep learning without burning a hole in your pockets. If you must have the absolute best GPU irrespective of the cost then the RTX 2080 Ti is your choice. It offers twice the performance for almost twice the cost of a 1080 Ti. Nvidia unveils a new Turing architecture: “The world’s first ray tracing GPU” Nvidia GPUs offer Kubernetes for accelerated deployments of Artificial Intelligence workloads Nvidia’s Volta Tensor Core GPU hits performance milestones. But is it the best?
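As a practical aside (not part of the original article), the specs discussed above, such as memory size and core counts, can be queried from Python if you already have a CUDA-capable card installed. Here is a minimal sketch using PyTorch, assuming a PyTorch build with CUDA support is available:

```python
# Minimal sketch: query the installed NVIDIA GPU(s) for the specs discussed above.
# Assumes a PyTorch build with CUDA support; otherwise it simply reports no GPU.
import torch

if not torch.cuda.is_available():
    print("No CUDA-capable GPU detected.")
else:
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}")
        print(f"  Memory size: {props.total_memory / 1024**3:.1f} GB")
        # Note: this is the number of streaming multiprocessors (SMs),
        # not individual CUDA cores; cores per SM vary by architecture.
        print(f"  Multiprocessors: {props.multi_processor_count}")
```

The reported memory size maps directly onto the 'Memory size' criterion above; memory bandwidth and exact CUDA core counts still need to be read off the vendor's spec sheet.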
Why is Hadoop dying?

Aaron Lazar
23 Apr 2018
5 min read
Hadoop has been the definitive big data platform for some time. The name has practically been synonymous with the field. But while its ascent followed the trajectory of what was referred to as the 'big data revolution', Hadoop now seems to be in danger. The question is everywhere - is Hadoop dying out? And if it is, why is it? Is it because big data is no longer the buzzword it once was, or are there simply other ways of working with big data that have become more useful?

Hadoop was essential to the growth of big data
When Hadoop was open sourced in 2007, it opened the door to big data. It brought compute to data, as against bringing data to compute. Organisations had the opportunity to scale their data without having to worry too much about the cost. It obviously had initial hiccups with security, the complexity of querying, and querying speeds, but all that was taken care of in the long run. Still, although querying speeds remained quite a pain, that wasn't the real reason behind Hadoop's (slow) decline.

As cloud grew, Hadoop started falling
One of the main reasons behind Hadoop's decline in popularity was the growth of cloud. The cloud vendor market was pretty crowded, and each vendor provided its own big data processing services. These services all basically did what Hadoop was doing. But they also did it in an even more efficient and hassle-free way. Customers didn't have to think about administration, security or maintenance in the way they had to with Hadoop.

One person's big data is another person's small data
Well, this is clearly a fact. Several organisations that used big data technologies without really gauging the amount of data they actually needed to process have suffered. Imagine sitting with 10TB Hadoop clusters when you don't have that much data. The two biggest organisations that built products on Hadoop, Hortonworks and Cloudera, saw a decline in revenue in 2015, owing to their massive reliance on Hadoop. Customers weren't pleased with the nature of Hadoop's limitations.

Apache Hadoop v Apache Spark
Hadoop is way behind in terms of processing speed. In 2014, Spark took the world by storm. I'm going to let you guess which line in the graph above might be Hadoop, and which might be Spark. Spark was a general purpose, easy to use platform that was built after studying the pitfalls of Hadoop. Spark was not bound to just HDFS (the Hadoop Distributed File System), which meant that it could leverage storage systems like Cassandra and MongoDB as well. Spark 2.3 was also able to run on Kubernetes - a big leap for containerized big data processing in the cloud. Spark also brings along GraphX, which allows developers to view data in the form of graphs. Some of the major areas where Spark wins are iterative algorithms in machine learning, interactive data mining and data processing, stream processing, sensor data processing, and so on.

Machine learning in Hadoop is not straightforward
Unlike MLlib in Spark, machine learning is not possible in Hadoop unless tied to a third-party library. Mahout used to be quite popular for doing ML on Hadoop, but its adoption has gone down in the past few years. Tools like RHadoop, a collection of three R packages, have grown for ML, but they are still nowhere comparable to the power of the modern-day MLaaS offerings from cloud providers. All the more reason to move away from Hadoop, right? Maybe.

Hadoop is not only Hadoop
The general misconception is that Hadoop is quickly going to be extinct. On the contrary, the Hadoop family consists of YARN, HDFS, MapReduce, Hive, HBase, Spark, Kudu, Impala, and 20 other products. While folks may be moving away from Hadoop as their choice for big data processing, they will still be using Hadoop in some form or the other. As for Cloudera and Hortonworks, though the market has seen a downward trend, they're in no way letting go of Hadoop anytime soon, although they have shifted part of their processing operations to Spark.

Is Hadoop dying? Perhaps not...
In the long run, it's not completely accurate to say that Hadoop is dying. December last year brought with it Hadoop 3.0, which is supposed to be a much improved version of the framework. Some of the most noteworthy features are its improved shell script, more powerful YARN, improved fault tolerance with erasure coding, and many more. Although that hasn't caused any major spike in adoption, there are still users who will adopt Hadoop based on their use case, or simply use another alternative like Spark along with another framework from the Hadoop family. So, Hadoop's not going away anytime soon.

Read More
Pandas is an effective tool to explore and analyze data - Interview insights
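To make the MLlib point concrete, here is a minimal, illustrative PySpark sketch (not from the original article) that trains a logistic regression model with Spark's built-in machine learning library; the toy data and parameters are arbitrary, and pyspark is assumed to be installed.

```python
# Minimal sketch: machine learning with Spark's built-in MLlib (pyspark.ml).
# Toy data and parameters are illustrative only; assumes pyspark is installed.
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("mllib-demo").getOrCreate()

# A tiny labelled dataset: (label, feature vector).
data = [
    (0.0, Vectors.dense([0.0, 1.1, 0.1])),
    (1.0, Vectors.dense([2.0, 1.0, -1.0])),
    (0.0, Vectors.dense([2.0, 1.3, 1.0])),
    (1.0, Vectors.dense([0.0, 1.2, -0.5])),
]
df = spark.createDataFrame(data, ["label", "features"])

# Train a logistic regression model directly on the DataFrame.
lr = LogisticRegression(maxIter=10, regParam=0.01)
model = lr.fit(df)

model.transform(df).select("label", "prediction").show()
spark.stop()
```

The same DataFrame could just as easily be read from HDFS, Cassandra or another store, which is part of Spark's appeal compared with Hadoop-native tooling.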
What makes functional programming a viable choice for artificial intelligence projects?

Prasad Ramesh
14 Sep 2018
7 min read
The most common programming languages currently used for AI and machine learning development are Python, R, Scala, and Go, among others, with the latest addition being Julia. Functional languages as old as Lisp and Haskell were used to implement machine learning algorithms decades ago, when AI was an obscure research area of interest; there just weren't enough hardware and software advancements back then for practical implementations. Some commonalities in all of the above language options are that they are simple to understand and promote clarity. They use fewer lines of code and lend themselves well to the functional programming paradigm.

What is functional programming?
Functional programming is a programming approach that uses logical functions or procedures within its programming structure. It means that the programming is done with expressions instead of statements. In a functional programming (FP) approach, computations are treated as evaluations of mathematical functions, and it mostly deals with immutable data. In other words, the state of the program does not change, the functions or procedures are fixed, and the output value of a function depends solely on the arguments passed to it. Let's look at the characteristics of a functional programming approach before we see why they are well suited to developing artificial intelligence programs.

Functional programming features
Before we see functional programming in a machine learning context, let's look at some of its characteristics (a short Python sketch at the end of this article illustrates several of them).

Immutability: If a variable x is declared and used in the program, its value is never changed later anywhere in the program. Each time the variable x is used, it returns the value originally assigned. This makes programs pretty straightforward, eliminating the need to think about state change throughout the program.
Referential transparency: This means that an expression or computation always results in the same value in any part/context of the program. A referentially transparent programming language's programs can be manipulated as algebraic equations.
Lazy evaluation: Being referentially transparent, computations yield the same result irrespective of when they are performed. This makes it possible to postpone the computation of values until they are required/called - that is, to evaluate them lazily. Lazy evaluation helps avoid unnecessary computations and saves memory.
Parallel programming: Since there is no state change due to immutable variables, the functions in a functional program can work in parallel as instructions. Parallel loops can be easily expressed, with good reusability.
Higher-order functions: A higher-order function can take one or more functions as arguments, and may also be able to return a function as its result. Higher-order functions are useful for refactoring code and reducing repetition. The map function found in many programming languages is an example of a higher-order function.

What kind of programming is good for AI development?
Machine learning is a sub-domain of artificial intelligence which deals with making predictions from data, taking actions without being explicitly programmed, recommendation systems, and so on. Any programming approach that focuses on logic and mathematical functions is good for artificial intelligence (AI). Once the data is collected and prepared, it is time to build your machine learning model. This typically entails choosing a model, then training and testing the model with the data.
Once the desired accuracy/results are achieved, then the model is deployed. Training on the data requires data to be consistent and the code to be able to communicate directly with the data without much abstraction for least unexpected errors. For AI programs to work well, the language needs to have a low level implementation for faster communication with the processor. This is why many machine learning libraries are created in C++ to achieve fast performance. OOP with its mutable objects and object creation is better suited for high-level production software development, not very useful in AI programs which works with algorithms and data. As AI is heavily based on math, languages like Python and R are widely used languages in AI currently. R lies more towards statistical data analysis but does support machine learning and neural network packages. Python being faster for mathematical computations and with support for numerical packages is used more commonly in machine learning and artificial intelligence. Why is functional programming good for artificial intelligence? There are some benefits of functional programming that make it suitable for AI. It is closely aligned to mathematical thinking, and the expressions are in a format close to mathematical definitions. There are few or no side-effects of using a functional approach to coding, one function does not influence the other unless explicitly passed. This proves to be great for concurrency, parallelization and even debugging. Less code and more consistency The functional approach uses fewer lines of code, without sacrificing clarity. More time is spent in thinking out the functions than actually writing the lines of code. But the end result is more productivity with the created functions and easier maintenance since there are fewer lines of code. AI programs consist of lots of matrix multiplications. Functional programming is good at this just like GPUs. You work with datasets in AI with some algorithms to make changes in the data to get modified data. A function on a value to get a new value is similar to what functional programming does. It is important for the variables/data to remain the same when working through a program. Different algorithms may need to be run on the same data and the values need to be the same. Immutability is well-suited for that kind of job. Simple approach, fast computations The characteristics/features of functional programming make it a good approach to be used in artificial intelligence applications. AI can do without objects and classes of an object oriented programming (OOP) approach, it needs fast computations and expects the variables to be the same after computations so that the operations made on the data set are consistent. Some of the popular functional programming languages are R, Lisp, and Haskell. The latter two are pretty old languages and are not used very commonly. Python can be used as both, functional and object oriented. Currently, Python is the language most commonly used for AI and machine learning because of its simplicity and available libraries. Especially the scikit-learn library provides support for a lot of AI-related projects. FP is fault tolerant and important for AI Functional programming features make programs fault tolerant and fast for critical computations and rapid decision making. As of now, there may not be many such applications but think of the future, systems for self-driving cars, security, and defense systems. Any fault in such systems would have serious effects. 
Immutability makes the system more reliable, lazy evaluation helps conserve memory, parallel programming makes the system faster. The ability to pass a function as an argument saves a lot of time and enables more functionality. These features of functional programming make it a fitting choice for artificial intelligence. To further understand why use functional programming for machine learning, read the case made for using the functional programming language Haskell for AI in the Haskell Blog. Why functional programming in Python matters: Interview with best selling author, Steven Lott Grain: A new functional programming language that compiles to Webassembly 8 ways Artificial Intelligence can improve DevOps
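As an illustration of these ideas (not from the original article), here is a small Python sketch that treats a feature-scaling step as a chain of pure functions, using the higher-order functions map and reduce; the data and helper functions are arbitrary examples.

```python
# A small sketch of functional style in Python: pure functions, higher-order
# functions, and no mutation of the input data. The example data is arbitrary.
from functools import reduce

raw_values = [3.0, 7.5, 1.2, 9.8, 4.4]

def scale(factor):
    # Higher-order function: returns a new pure function.
    return lambda x: x * factor

def compose(*funcs):
    # Compose single-argument functions left to right.
    return reduce(lambda f, g: lambda x: g(f(x)), funcs)

# A tiny "pipeline" built purely from function composition.
normalize = compose(scale(1.0 / max(raw_values)), lambda x: round(x, 3))

# map applies the pipeline without mutating raw_values.
normalized = list(map(normalize, raw_values))

print(raw_values)   # the original list is unchanged (no mutation)
print(normalized)   # [0.306, 0.765, 0.122, 1.0, 0.449]
```

Because every step is a pure function, the same pipeline could be mapped over partitions of a dataset in parallel without worrying about shared state.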
Top 14 Cryptocurrency Trading Bots - and one to forget

Guest Contributor
21 Jun 2018
9 min read
Men in rags became millionaires and rich people bit the dust within minutes, thanks to cryptocurrencies. According to research, over 1,500 cryptocurrencies are being traded globally across over 6 million wallets, proving that digital currency is here not just to stay but to rule. The rise and fall of the crypto market isn't hidden from anyone, but the catch is - cryptocurrency still sells like hot cakes. According to Bill Gates, "The future of money is digital currency".

With thousands of digital currencies rolling globally, crypto traders are immensely occupied, and this is where cryptocurrency trading bots come into play. They ease the currency trading and research process, which results in spending less effort and earning more money, not to mention the hours saved. According to Eric Schmidt, ex-CEO of Google, "Bitcoin is a remarkable cryptographic achievement and the ability to create something that is not duplicable in the digital world has enormous value." The crucial part is whether the crypto trading bot is dependable and efficient enough to deliver optimum results within crunch time. To make sure you don't miss an opportunity to chip cash into your digital wallet, here are the top crypto trading bots, ranked according to performance:

1- Gunbot
Gunbot is a crypto trading bot that boasts detailed settings and is fit for beginners as well as professionals. Along with making custom strategies, it comes with a "Reversal Trading" feature. It enables continuous trading and works with almost all the exchanges (Binance, Bittrex, GDAX, Poloniex, etc.). Gunbot is backed by thousands of users that eventually created an engaging and helpful community. While Gunbot offers different packages with price tags of 0.02 to 0.15 BTC, you can always upgrade them. The bot comes with a lifetime license and is constantly upgraded.

2- Haasbot
Hassonline created this cryptocurrency trading bot in January 2014. Its algorithm is very popular among cryptocurrency geeks. It can trade over 500 altcoins and bitcoins on famous exchanges such as BTCC, Kraken, Bitfinex, Huobi, Poloniex, etc. You need to put in a little of the currency and the bot will do all the trading work for you. Haasbot is customizable and has various technical indicator tools. The cryptocurrency trading bot also recognizes candlestick patterns. This immensely popular trading bot is priced between 0.12 BTC and 0.32 BTC for three months.

3- Gekko
Gekko is a cryptocurrency trading bot that supports over 18 Bitcoin exchanges including Bitstamp, Poloniex, Bitfinex, etc. This bot is a backtesting platform and is free to use. It is a full-fledged open source bot that is available on GitHub. Using this bot is easy, as it comes with basic trading strategies. The web interface of Gekko was written from scratch, and it can run backtests and visualize the test results while you monitor your local data with it. Gekko updates you on the go using plugins for Telegram, IRC, email and several other platforms. The trading bot works great with all operating systems, such as Windows, Linux and macOS. You can even run it on your Raspberry Pi and cloud platforms.

4- CryptoTrader
CryptoTrader is a cloud-based platform which allows users to create automated algorithmic trading programs in minutes. It is one of the most attractive crypto trading bots, and you won't need to install any unknown software with this bot. A highly appreciated feature of CryptoTrader is its Strategy Marketplace, where users can trade strategies.
It supports major currency exchanges such as Coinbase, Bitstamp, BTCe and is supported for live trading and backtesting. The company claims its cloud based trading bots are unique as compared with the currently available bots in the market. 5- BTC Robot One of the very initial automated crypto trading bot, BTC Robot offers multiple packages for different memberships and software. It provides users with a downloadable version of Windows. The minimum robot plan is of $149. BTC Robot sets up quite easily but it is noted that its algorithms aren't great at predicting the markets. The user mileage in BTC Robot varies heavily leaving many with mediocre profits. With the trading bot’s fluctuating evaluation, the profits may go up or down drastically depending on the accuracy of algorithm. On the bright side the bot comes with a sixty day refund policy that makes it a safe buy. 6- Zenbot Another open source trading bot for bitcoin trading, Zenbot can be downloaded and its code can be modified too. This trading bot hasn't got an update in the past months but still, it is among one of the few bots that can perform high frequency trading while backing up multiple assets at a time. Zenbot is a lightweight artificially intelligent crypto trading bot and supports popular exchanges such as Kraken, GDAX, Poloniex, Gemini, Bittrex, Quadriga, etc. Surprisingly, according to the GitHub’s page, Zenbot’s version 3.5.15 bagged an ROI of 195% in just a mere period of three months. 7- 3Commas 3Commas is a famous cryptocurrency trading bot that works well with various exchanges including Bitfinex, Binance, KuCoin, Bittrex, Bitstamp, GDAX, Huiboi, Poloniex and YOBIT. As it is a web based service, you can always monitor your trading dashboard on desktop, mobile and laptop computers. The bot works 24/7 and it allows you to take-profit targets and set stop-loss, along with a social trading aspect that enables you to copy the strategies used by successful traders. ETF-Like feature allows users to analyze, create and back-test a crypto portfolio and pick from the top performing portfolios created by other people. 8- Tradewave Tradewave is a platform that enables users to develop their own cryptocurrency trading bots along with automated trading on crypto exchanges. The bot trades in the cloud and uses Python to write the code directly in the browser. With Tradewave, you don't have to worry about the downtime. The bot doesn't force you to keep your computer on 24x7 nor it glitches if not connected to the internet. Trading strategies are often shared by community members that can be used by others too. However, it currently supports very few cryptocurrency exchanges such as Bitstamp and BTC-E but more exchanges will be added in coming months. 9- Leonardo Leonardo is a cryptocurrency trading bot that supports a number of exchanges such as Bittrex, Bitfinex, Poloniex, Bitstamp, OKCoin, Huobi, etc. The team behind Leonardo is extremely active and new upgrades including plugins are in the funnel. Previously, it cost 0.5 BTC but currently, it is available for $89 with a license of single exchange. Leonardo boasts of two trading strategy bots including Ping Pong Strategy and Margin Maker Strategy. The first strategy enables users to set the buy and sell price leaving all of the other plans to the bot while the Margin Maker strategy can buy and sell on price adjusted according to the direction in the market. This trading bot stands out in terms of GUI. 
10- USI Tech USI Tech is a trading bot that is majorly used for forex trading but it also offers BTC packages. While majority of trading bots require an initial setup and installation, USI uses a different approach and it isn't controlled by the users. Users are needed to buy-in from their expert mining and bitcoin trade connections and then, the USI Tech bot guarantees a daily profit from the transactions and trade. To earn one percent of the capital daily, customers are advised to choose feature rich plans.. 11- Cryptohopper Cryptohopper  is a 24/7 cloud based trading bot that means it doesn't matter  if you are on the computer or not. Its system enables users to trade on technical indicators with subscription to a signaler who sends buy signals. According to the Cryptohopper’s website, it is the first crypto trading bot that is integrated with professional external signals. The bot helps in leveraging bull markets and has a latest dashboard area where users can monitor and configure everything. The dashboard also includes a configuration wizard for the major exchanges including Bittrex, GDAX, Kraken,etc. 12- My Bitcoin Bot MBB is a team effort from Brad Sheridon and his proficient teammates who are experts of cryptocurrency investment. My Bitcoin Bot is an automated trading software that can be accessed by anyone who is ready to pay for it. While the monthly plan is of $39 a month, the yearly subscription for this auto-trader bot is available for of $297. My bitcoin bot comes with heaps of advantages such as unlimited technical support, free software updates, access to trusted brokers list, etc. 13- Crypto Arbitrager A standalone application that operates on a dedicated server, Crypto Arbitrager can leverage robots even when the PC is off. The developers behind this cryptocurrency trading bot claim that this software uses code integration of financial time series. Users can make money from the difference in rates of Litecoins and Bitcoins. By implementing the advanced strategy of hedge funds, the trading bot effectively manages savings of users regardless of the state of the cryptocurrency market. 14- Crypto Robot 365 Crypto Robot 365 automatically trades your digital currency. It buys and sells popular cryptocurrencies such as Ripple, Bitcoin, ethereum, Litecoin, Monero, etc. Rather than a signup fee, this platform charges its commision on a per trade basis. The platform is FCA-Regulated and offers a realistic achievable win ratio. According to the trading needs, users can tweak the system. Moreover, it has an established trading history and  it even offers risk management options. Down The Line While cryptocurrency trading is not a piece of cake, trading with currency bots may be confusing for many. The aforementioned trading bots are used by many and each is backed by years of extensive hard work. With reliability, trustworthiness, smartwork and proactiveness being top reasons for choosing any cryptocurrency trading bot, picking up a trading bot is a hefty task. I recommend you experiment with small amount of money first and if your fate gets to a shining start, pick the trading bot that perfectly suits your way of making money via cryptocurrency. About the Author Rameez Ramzan is a Senior Digital Marketing Executive of Cubix - mobile app development company.  He specializes in link building, content marketing, and site audits to help sites perform better. He is a tech geek and loves to dwell on tech news. 
Crypto-ML, a machine learning powered cryptocurrency platform Beyond the Bitcoin: How cryptocurrency can make a difference in hurricane disaster relief Apple changes app store guidelines on cryptocurrency mining
What is interactive machine learning?

Amey Varangaonkar
23 Jul 2018
4 min read
Machine learning is a useful and effective tool to have when it comes to building prediction models or building a useful data structure from an avalanche of data. Many ML algorithms are in use today for a variety of real-world use cases. Given a sample dataset, a machine learning model can give predictions with only a certain accuracy, which largely depends on the quality of the training data fed to it. Is there a way to increase the prediction accuracy by somehow involving humans in the process? The answer is yes, and the solution is called 'Interactive Machine Learning'.

Why we need interactive machine learning
As we already discussed above, a model can give predictions only as good as the quality of the training data fed to it. If the quality of the training data is not good enough, the model might:

Take more time to learn before it gives accurate predictions
Give predictions of very poor quality

This challenge can be overcome by involving humans in the machine learning process. By incorporating human feedback in the model training process, the model can be trained faster and more efficiently to give more accurate predictions. In the widely adopted machine learning approaches, including supervised and unsupervised learning, or even active learning for that matter, there is no way to include human feedback in the training process to improve the accuracy of predictions. In the case of supervised learning, for example, the data is already pre-labelled and is used without any actual inputs from the human during the training process. For this reason alone, the concept of interactive machine learning is seen by many machine learning and AI experts as a breakthrough.

How interactive machine learning works
Machine learning researchers Teng Lee, James Johnson and Steve Cheng have suggested a novel way to include human inputs to improve the performance and predictions of a machine learning model. It is called the 'Transparent Boosting Tree' algorithm, which is a very interesting approach to combining the advantages of machine learning and human inputs in the final decision-making process. The Transparent Boosting Tree, or TBT for short, is an algorithm that visualizes the model and the prediction details of each step in the machine learning process for the user, takes his/her feedback, and incorporates it into the learning process. The ML model is in charge of updating the weights assigned to the inputs, and of filtering the information shown to the user for his/her feedback. Once the feedback is received, it can be incorporated by the ML model as part of the learning process, thus improving it. A basic flowchart of the interactive machine learning process is shown in the accompanying figure (Interactive Machine Learning). More in-depth information on how interactive machine learning works can be found in their paper.

What can interactive machine learning do for businesses?
With the rising popularity and applications of AI across all industry verticals, humans may have a key role to play in the learning process of an algorithm, apart from just coding it. While observing the algorithm's own outputs or evaluations in the form of visualizations or plain predictions, humans can suggest ways to improve those predictions by giving feedback in the form of inputs such as labels, corrections or rankings.
This helps the models in two ways: Increases the prediction accuracy Time taken for the algorithm to learn is shortened considerably Both the advantages can be invaluable to businesses, as they look to incorporate AI and machine learning in their processes, and look for faster and more accurate predictions. Interactive Machine Learning is still in its nascent stage and we can expect more developments in the domain to surface in the coming days. Once production-ready, it will undoubtedly be a game-changer. Read more Active Learning: An approach to training machine learning models efficiently Anatomy of an automated machine learning algorithm (AutoML) How machine learning as a service is transforming cloud
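The Transparent Boosting Tree itself is described in the researchers' paper, but the general idea of folding human feedback into training can be sketched with a simple uncertainty-based loop in scikit-learn. The sketch below is only an illustration of the concept (the data, oracle, and number of rounds are made up); it is not the TBT algorithm.

```python
# A toy sketch of a human-in-the-loop training loop: the model flags the point
# it is least sure about, a "human" supplies the label, and the model is refit.
# This only illustrates the idea of incorporating human feedback; it is not TBT.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X_pool = rng.normal(size=(200, 2))
oracle = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # stands in for human answers

# Seed the labelled set with one example of each class so the model can be fit.
labelled = [int(np.argmax(oracle)), int(np.argmin(oracle))]
model = LogisticRegression()

for _ in range(10):  # ten rounds of feedback
    model.fit(X_pool[labelled], oracle[labelled])

    # Pick the unlabelled point whose predicted probability is closest to 0.5.
    unlabelled = [i for i in range(len(X_pool)) if i not in labelled]
    probs = model.predict_proba(X_pool[unlabelled])[:, 1]
    query = unlabelled[int(np.argmin(np.abs(probs - 0.5)))]

    # In a real system a person would inspect this example and give feedback;
    # here the oracle array stands in for that human input.
    labelled.append(query)

print("Examples labelled through feedback:", len(labelled))
print("Training accuracy after feedback:", model.score(X_pool[labelled], oracle[labelled]))
```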
How to build a location-based augmented reality app

Guest Contributor
22 Nov 2018
7 min read
The augmented reality market is developing rapidly. According to Statista, it has a total market value of almost $15 billion today, and this figure could rise to $210 billion by 2022. Augmented reality is having a huge impact on the games industry, but it's being used by organizations in fields as diverse as publishing and retail. For example, Layar is an app that turns static objects into live objects, while IKEA's Catalog app lets you imagine how different types of furniture might fit into your room. But it's not just about commerce: some apps have a distinctly educational bent, like Field Trip. Field Trip uses augmented reality to help users learn about the history that immediately surrounds them. The best augmented reality apps are always deceptively simple. But to build a really effective augmented reality application you need a diverse range of skills that span both the domains of software and real-world physics. Let's take a closer look at location-based augmented reality apps, including what they're used for and how you can begin building them.

How does a location-based AR app work?
Location-based augmented reality apps are sometimes called geo-based AR apps. Whatever you call them, one thing is important: they combine GPS mobile data and the digital compass to detect the location and position of the device. The application works like this: the AR app dispatches queries to the device's sensors, and once the data has been acquired, the app can determine where virtual information (such as images) should be added to the real world. Location-based augmented reality apps can be used both indoors and outdoors. Indoors, where it isn't possible to connect to GPS, the application will use beacons for location data.

The best examples of existing location-based augmented reality apps
While reading about location-based augmented reality apps can give you a good idea of how they work, to be really inspired you need to try some out for yourself. Here's a list of some of the best location-based augmented reality apps out there.

Yelp Monocle
Yelp Monocle helps you navigate an unknown city. Using GPS, it provides exactly the sort of information you'd expect from Yelp, but in a format that's fully integrated with your surroundings. So, you can see restaurant reviews and shop opening hours as you move around your environment.

Ingress
Ingress is an augmented reality gaming app that immerses you in a (semi) virtual world. Your main mission is to find portals that the game 'creates' in your immediate environment and open them. Essentially, the game is a great way to explore the world around you, and it places a new augmented layer on a place that might otherwise be familiar.

Vortex Planetarium
Vortex Planetarium is an app for aspiring astronomers or anybody else with a passing interest in astronomy. The app detects the user's location and then provides them with celestial data to better understand the night sky.

Steps to create a location-based AR app
So, if you like the idea of a location-based augmented reality app, you'll probably want to get started. As we've seen, these apps can be incredibly complex, but if you break the development process down, it should become much easier.

1. Determine what resources you need
Depending on the complexity of your app, you need to determine what resources are required - that could be anything from data to other frameworks and services. For example, if you plan to create a game with 3D objects, you'll need to use Unity to build in that level of functionality and realism.

2. Choose the right augmented reality tool
There are a huge number of augmented reality software development kits out there. Rather than wade through every single one, we will list the most popular ones, which can give you the widest range of possible features.

ARKit by Apple
ARKit from Apple features just about everything you'd need to develop an augmented reality application. For example, it combines computer vision and camera data to track the user's environment. ARKit is also able to adjust the light level in the virtual model to respond to the level of light in the real world. ARKit 2 recently brought users a number of cool new features. For example, it allows you to build interactivity into your application, and it also allows you to build 'memory' into your app so it can 'remember' the location of augmented reality objects.

ARCore by Google
In Google's ARCore you'll find a mapping tool which is particularly useful for developing location-based AR apps. ARCore can also track motion and detect vertical and horizontal surfaces. In the latest version of ARCore, users can take two devices and work with one AR object from different viewing angles.

3. Add geolocation data
Not all SDKs provide a mapping feature. If yours doesn't, it's essential to make sure you add in geolocation data - without it, the app wouldn't work! As we've already seen, GPS technology is typically used. It's convenient and it can detect a user's location anywhere; it can, however, consume a lot of energy. Location services on iOS and Android will help to activate geolocation on the device.

3 augmented reality pitfalls to avoid
Developing something as complex as a location-based augmented reality app is bound to lead to some challenges, so be prepared - watch out for some of these pitfalls.

Ensure you have proper functionality. When users move with their camera and look for AR objects, these objects should remain static, regardless of the user's movements. To do this, use SLAM - Simultaneous Localization and Mapping. This is a technique that allows software systems - like robots - to 'understand' where they are situated in relation to their surroundings.
Accuracy. A crucial factor for any AR app is accuracy. When developing your app, it's essential to consider the user's position to ensure that the app sends queries to the sensors correctly. If it doesn't, the whole experience could seem plain weird to the user. Similarly, the distance between the device and the real world must be calculated correctly - again, if it isn't, your application simply will not work (a short coordinate-distance sketch appears at the end of this article).

Get started - build an awesome augmented reality app!
Clearly, building a location-based augmented reality app isn't easy. It requires skill and a commitment to keep going in the face of challenges. You certainly need a team of great developers around you if you're going to deliver something that makes an impact. But, really, that's what makes software development exciting, right?

Author Bio
Vitaly Kuprenko is a technical writer at Cleveroad, a web and mobile app development company in Ukraine. He enjoys writing about tech innovations and digital ways to boost businesses.

Magic Leap unveils Mica, a human-like AI in augmented reality.
Magic Leap teams with Andy Serkis’ Imaginarium Studios to enhance Augmented Reality “As Artists we should be constantly evolving our technical skills and thought processes to push the boundaries on what’s achievable,” Marco Matic Ryan, Augmented Reality Artist
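Getting distances right is mostly straightforward spherical geometry. As an illustration only (a real AR app would do this in Swift, Kotlin or JavaScript rather than Python), here is the standard haversine formula for the distance between the user's GPS fix and a virtual object's anchor coordinates; the sample coordinates are made up.

```python
# Illustrative sketch: great-circle (haversine) distance between the user's GPS
# position and an AR object's anchor coordinates. Sample coordinates are made up.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Distance in metres between two (latitude, longitude) points in degrees."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.atan2(math.sqrt(a), math.sqrt(1 - a))

user = (51.5007, -0.1246)      # hypothetical device GPS fix
anchor = (51.5014, -0.1419)    # hypothetical AR object location

print(f"Object is {haversine_m(*user, *anchor):.0f} m away")
```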
A Brief History of Python

Sam Wood
14 Oct 2015
4 min read
From data to web development, Python has come to stand as one of the most important and most popular open source programming languages being used today. But whilst some see it as almost a new kid on the block, Python is actually older than both Java, R, and JavaScript. So what are the origins of our favorite open source language? In the beginning... Python's origins lie way back in distant December 1989, making it the same age as Taylor Swift. Created by Guido van Rossum (the Python community's Benevolent Dictator for Life) as a hobby project to work on during week around Christmas, Python is famously named not after the constrictor snake but rather the British comedy troupe Monty Python's Flying Circus. (We're quite thankful for this at Packt - we have no idea what we'd put on the cover if we had to pick for 'Monty' programming books!) Python was born out of the ABC language, a terminated project of the Dutch CWI research institute that van Rossum worked for, and the Amoeba distributed operating system. When Amoeba needed a scripting language, van Rossum created Python. One of the principle strengths of this new language was how easy it was to extend, and its support for multiple platforms - a vital innovation in the days of the first personal computers. Capable of communicating with libraries and differing file formats, Python quickly took off. Computer Programming for Everybody Python grew throughout the early nineties, acquiring lambda, reduce(), filter() and map() functional programming tools (supposedly courtesy of a Lisp hacker who missed them and thus submitted working patches), key word arguments, and built in support for complex numbers. During this period, Python also served a central role in van Rossum's Computer Programming for Everybody initiative. The CP4E's goal was to make programming more accessible to the 'layman' and encourage a basic level of coding literacy as an equal essential knowledge alongside English literacy and math skills. Because of Python's focus on clean syntax and accessibility, it played a key part in this. Although CP4E is now inactive, learning Python remains easy and Python is one of the most common languages that new would-be programmers are pointed at to learn. Going Open with 2.0 As Python grew in the nineties, one of the key issues in uptake was its continued dependence on van Rossum. 'What if Guido was hit by a bus?' Python users lamented, 'or if he dropped dead of exhaustion or if he is rubbed out by a member of a rival language following?' In 2000, Python 2.0 was released by the BeOpen Python Labs team. The ethos of 2.0 was very much more open and community oriented in its development process, with much greater transparency. Python moved its repository to SourceForge, granting write access to its CVS tree more people and an easy way to report bugs and submit patches. As the release notes stated, 'the most important change in Python 2.0 may not be to the code at all, but to how Python is developed'. Python 2.7 is still used today - and will be supported until 2020. But the word from development is clear - there will be no 2.8. Instead, support remains focused upon 2.7's usurping younger brother - Python 3. The Rise of Python 3 In 2008, Python 3 was released on an almost-unthinkable premise - a complete overhaul of the language, with no backwards compatibility. The decision was controversial, and born in part of the desire to clean house on Python. 
There was a great emphasis on removing duplicative constructs and modules, to ensure that in Python 3 there was one - and only one - obvious way of doing things. Despite the introduction of tools such as 2to3, which could quickly identify what would need to change in Python 2 code to make it work in Python 3, many users stuck with their classic codebases. Even today, there is no assumption that Python programmers will be working with Python 3. Despite flame wars raging across the Python community, Python 3's eventual ascendancy was something of an inevitability. Python 2 remains a supported language (for now). But as much as it may still be the default choice of Python, Python 3 is the language's future.

The Future

Python's userbase is vast and growing - it's not going away any time soon. Used by the likes of Nokia, Google, and even NASA for its easy syntax, it looks to have a bright future ahead of it, supported by a huge community of open source developers. Its support for multiple programming paradigms, including object-oriented, functional, and parallel programming models, makes it a highly adaptable choice - and its uptake keeps growing.


A Machine Learning Roadmap for Web Developers

Sugandha Lahoti
27 Aug 2018
7 min read
Now that you’ve opened this article, I’ll assume you’re a web developer who is excited at the prospect of building a machine learning project. You may be here for one of these reasons. Perhaps you’ve been in a circle of people who think web development is dying (is it really dying, or just unwell?), or maybe you’re stagnating in your current trajectory, and so you want to learn something different, something trending, something like Artificial Intelligence. Or you, your employer, or your client is aware of the capabilities of machine learning and wants to include it in some part of your web app to make it more powerful. Or, like most folks, you just want to see first-hand whether all the fuss about artificial intelligence is worth the effort of switching gears, by building a toy ML side project. Either way, there are different approaches to fulfilling these needs.

Learning Machine Learning for the Web with JavaScript

Coming to machine learning from a web development background has its own constraints. You might worry about having to learn entirely different concepts from scratch - from new algorithms, to programming languages like Python, to mathematical concepts like linear algebra, calculus, and statistics. However, chances are you can skip learning a new language. You probably already know some JavaScript thanks to your web development experience. As such, you can learn machine learning in JavaScript (you don’t have to learn another programming language from scratch) and take it right to the browser with WebGL. There are some advantages to using JavaScript for ML. Its popularity is one: while ML in JavaScript is not as popular as Python’s ML ecosystem at the moment, the language itself is. As demand for ML applications rises, and as hardware becomes faster and cheaper, it’s only natural for machine learning to become more prevalent in the JavaScript world. The JavaScript ecosystem offers a rich set of libraries suited to most machine learning tasks (a minimal TensorFlow.js sketch follows at the end of this section):

Math: math.js
Data analysis: d3.js
Server: Node.js (Express, Koa, Hapi)
Performance: TensorFlow.js (GPU accelerated via the WebGL API in the browser), Keras.js, etc.

Read also: 5 JavaScript machine learning libraries you need to know

BRIIM is a good collection of materials to get you started in machine learning as a web developer or JavaScript enthusiast. In case you’re interested in learning Python instead of JavaScript, here is the set of libraries you could pick from:

Math: NumPy
Data analysis: Pandas
Data mining: PySpark
Server: Flask, Django
Performance: TensorFlow (written with a Python API over a C/C++ engine) or Keras (sits on top of TensorFlow)

Using Machine Learning as a Service

If you don’t want to spend your time learning frameworks, tools, and languages suited to machine learning, you can adopt Machine Learning as a Service, or MLaaS. These services provide machine learning tools as part of cloud computing offerings. In other words, you can benefit from machine learning without the associated cost, time, and risk of establishing an in-house machine learning team; all you need is sufficient knowledge of working with APIs. All machine learning tasks, including data pre-processing, model training, model evaluation, and prediction, can be completed through MLaaS.

Read also: How machine learning as a service is transforming cloud

A large number of companies provide Machine Learning as a service.
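Before surveying those providers, here is the TensorFlow.js sketch promised above: a minimal, hypothetical example of training a tiny model entirely in the browser. The data and layer sizes are made up for illustration, and it assumes TensorFlow.js has been loaded, for example from a CDN script tag.

```javascript
// Assumes tf is available globally, e.g. via:
// <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>

// A tiny model that learns the linear relationship y = 2x - 1
// from six hand-made data points (illustrative data only).
const model = tf.sequential();
model.add(tf.layers.dense({units: 1, inputShape: [1]}));
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);

model.fit(xs, ys, {epochs: 200}).then(() => {
  // Predict y for x = 10; the result should approach 19 as training converges.
  model.predict(tf.tensor2d([10], [1, 1])).print();
});
```

Everything here runs client-side, and the tf.layers API deliberately mirrors Keras, which makes it easier to move between TensorFlow.js and Python's Keras later on.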
The most prominent MLaaS providers include:

Amazon Machine Learning

Amazon ML makes it easy for web developers to build smart applications using simple APIs. This includes applications for fraud detection, demand forecasting, targeted marketing, and click prediction. Amazon provides a Developer Guide, which gives a conceptual overview of Amazon ML and includes detailed instructions for using the service, as well as an API reference, which describes all the API operations and provides sample requests and responses for the supported web service protocols.

Azure ML web app templates

The web app templates available in the Azure Marketplace can build a custom web app that knows your web service's input data and expected results. All you need to do is give the web app access to your web service and data, and the template does the rest. There are two available templates:

Azure ML Request-Response Service Web App Template
Azure ML Batch Execution Service Web App Template

Each template creates a sample ASP.NET application using the API URI and key for your web service, and then deploys the application as a website to Azure. No coding is required to use these templates: you just supply the API key and URI, and the template builds the application for you.

Google Cloud-based APIs

Google also provides machine learning services, with pre-trained models and a service to train your own tailored models. Google's Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs. Cloud AutoML is used by Disney on its website shopDisney to enhance the guest experience through more relevant search results, expedited discovery, and product recommendations.

Building Conversational Interfaces

As a web developer, another thing you might be looking into is developing conversational interfaces, or chatbots, to enhance your web apps. Amazon, Google, and Microsoft all provide machine learning powered tools to help developers build their own chatbots.

Amazon Lex

You can embed chatbots in your web apps with Amazon Lex, which features ASR (Automatic Speech Recognition) and NLP (Natural Language Processing) capabilities. The API can recognize written and spoken text, and the Lex interface allows you to hook the recognized inputs into various back-end solutions (a minimal calling sketch follows at the end of this section). Lex currently supports deploying chatbots for Facebook Messenger, Slack, and Twilio.

Google Dialogflow

Google's Dialogflow can build voice- and text-based conversational interfaces, such as voice apps and chatbots, powered by AI. Dialogflow incorporates Google's machine learning expertise and products such as Google Cloud Speech-to-Text. The API can be tweaked and customized for the intents you need using Java, Node.js, and Python, and it is also available as an enterprise edition.

Microsoft Azure Cognitive Services

Microsoft Cognitive Services simplify a variety of AI-based tasks, giving you a quick way to add intelligence technologies to your bots with just a few lines of code. They provide tools and APIs that aid the development of conversational interfaces.
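As promised above, here is a minimal sketch of calling a Lex bot from a Node.js backend with the AWS SDK for JavaScript (v2). The bot name, alias, user id, and region are hypothetical placeholders; substitute the values of a bot you have actually deployed.

```javascript
// A sketch only: assumes the aws-sdk package is installed and credentials
// are configured in the environment. Bot name and alias below are placeholders.
const AWS = require('aws-sdk');

const lex = new AWS.LexRuntime({region: 'us-east-1'});

lex.postText({
  botName: 'OrderFlowers',   // hypothetical bot name
  botAlias: 'prod',          // hypothetical alias
  userId: 'web-user-42',     // any id that identifies this conversation
  inputText: 'I would like to order some roses'
}, (err, data) => {
  if (err) return console.error(err);
  // data.message holds the bot's reply; data.dialogState indicates whether
  // Lex needs more information (e.g. 'ElicitSlot') or has fulfilled the intent.
  console.log(data.dialogState, data.message);
});
```

Dialogflow and Cognitive Services follow a similar request/response pattern through their own SDKs and REST endpoints, so the integration work in your web app looks much the same whichever provider you choose.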
Microsoft's Cognitive Services offerings include:

Translator Speech API
Bing Speech API, to convert text into speech and speech into text
Speaker Recognition API, for voice verification tasks
Custom Speech Service, to apply Azure NLP capabilities using your own data and models
Language Understanding Intelligent Service (LUIS), an API that analyzes intentions in text so they can be recognized as commands
Text Analytics API, for sentiment analysis and identifying topics
Bing Spell Check API
Translator Text API
Web Language Model API, which estimates the probabilities of word combinations and supports word autocompletion
Linguistic Analysis API, used for sentence separation, part-of-speech tagging, and dividing text into labeled phrases

(A short sketch of calling one of these REST APIs directly appears at the end of this article.)

Read also: Top 4 chatbot development frameworks for developers

These tools should be enough to get you off the ground quickly and into the specific area of machine learning you are interested in. Ultimately, your choice of tool depends on the kind of application you want to build, your level of expertise, and how much time and effort you're willing to put into learning. Obviously, depending on your area of choice, you will have to do more research and develop your skills further in that area.

Read also:
How should web developers learn machine learning?
5 examples of Artificial Intelligence in Web apps
The most valuable skills for web developers to learn in 2018
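As mentioned above, here is a minimal sketch of calling one of those REST endpoints (Text Analytics sentiment) directly from JavaScript with fetch. The region, API version, and response shape are assumptions based on the v2.0 Text Analytics API, and the subscription key is a placeholder; check the current API reference before relying on the details.

```javascript
// A sketch only: region, API version and payload shape are assumptions
// based on the v2.0 Text Analytics API; YOUR_SUBSCRIPTION_KEY is a placeholder.
const endpoint =
  'https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment';

fetch(endpoint, {
  method: 'POST',
  headers: {
    'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    documents: [
      {id: '1', language: 'en', text: 'The new release is fantastic!'}
    ]
  })
})
  .then(res => res.json())
  // Each document comes back with a sentiment score between 0 (negative)
  // and 1 (positive).
  .then(json => console.log(json.documents[0].score))
  .catch(console.error);
```

The same general pattern - a key in the Ocp-Apim-Subscription-Key header and a JSON body - applies to most of the Cognitive Services text APIs listed above.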