
Tech News - Data


NVIDIA brings new deep learning updates at CVPR conference

Sunith Shetty
20 Jun 2018
4 min read
The NVIDIA team announced a new set of deep learning updates across its cloud computing software and hardware stack during the Computer Vision and Pattern Recognition conference (CVPR 2018), held in Salt Lake City. Key announcements include Apex, an early release of a new open-source PyTorch extension; NVIDIA DALI and NVIDIA nvJPEG for efficient data pipelines and image decoding; a release candidate of Kubernetes on NVIDIA GPUs; and version 4 of the TensorRT runtime engine. Let's look at some noteworthy updates made during the conference.

Apex

Apex is an open-source PyTorch extension that bundles the NVIDIA-maintained utilities needed for optimized, efficient mixed precision and distributed training in PyTorch. It helps machine learning engineers and data scientists maximize deep learning training performance on NVIDIA Volta GPUs, and its core promise is to deliver up-to-date utilities to users as quickly as possible. The NVIDIA PyTorch team drew on state-of-the-art mixed precision training in tasks such as sentiment analysis, translation networks, and image classification, building a set of tools that bring these methods to all levels of PyTorch users. Apex's mixed precision utilities are designed to improve training speed while maintaining the accuracy and stability of single-precision training. With Apex, four or fewer line changes to existing code enable automatic loss scaling, automated execution of operations in FP16 or FP32, and automatic handling of master parameter conversion. To install and use Apex in your own development environment, you will need CUDA 9, PyTorch 0.4 or later, and Python 3. The extension is still an early release, so expect its modules and utilities to change.
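The automatic loss scaling mentioned above guards against small FP16 gradients underflowing to zero: the loss is multiplied by a large factor before backpropagation, and the gradients are divided by the same factor afterwards. A toy illustration in plain Python (FP16 underflow is emulated here with a hard cutoff; this is a conceptual sketch, not Apex's implementation):

```python
FP16_MIN = 6.1e-5   # approx. smallest positive normal float16 value

def to_fp16(x):
    """Crude emulation of FP16 underflow: magnitudes below the
    smallest normal float16 value flush to zero."""
    return 0.0 if abs(x) < FP16_MIN else x

def scaled_gradients(grads, scale):
    """Scale gradients up (as if the loss had been multiplied by
    `scale` before backprop), store them in 'FP16', then unscale."""
    return [to_fp16(g * scale) / scale for g in grads]

grads = [3e-6, 2e-4, -7e-7]           # small gradients, common in training
print(scaled_gradients(grads, 1))     # two of the three gradients are lost
print(scaled_gradients(grads, 1024))  # all three survive
```

Because the scale is a power of two, scaling and unscaling are exact in floating point, so the surviving gradients are recovered bit-for-bit.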
To download the code and get started with the tutorials and examples, visit the GitHub page; the official announcement page has more details.

NVIDIA DALI and NVIDIA nvJPEG

NVIDIA DALI harnesses the GPU through the NVIDIA nvJPEG library to work on images at much greater speed, addressing the performance bottlenecks that image decoding and augmentation create in deep-learning-powered computer vision applications. DALI is an open-source, GPU-accelerated data augmentation and image loading library for optimizing the data pipelines of deep learning frameworks; see its GitHub page to learn more. nvJPEG is a GPU-accelerated library for JPEG decoding, available as a release candidate for feedback and testing. Together they give deep learning practitioners and researchers optimized training performance on image classification models such as ResNet-50 with MXNet, TensorFlow, and PyTorch across Amazon Web Services P3 8-GPU instances or DGX-1 systems with Volta GPUs. See the official announcement page for more details.

Kubernetes on NVIDIA GPUs

The NVIDIA team also announced a release candidate of Kubernetes on NVIDIA GPUs, freely available to developers for testing. It lets enterprises scale up training and deployment to multi-cloud GPU clusters smoothly, ensuring automated deployment, maintenance, scheduling, and operation of multiple GPU-accelerated containers across clusters of nodes, and lets you manage growing resources on heterogeneous GPU clusters. To learn more about this update, see the official announcement page.

TensorRT 4

This new release of NVIDIA's inference optimizer and runtime engine adds new layers such as recurrent neural networks and multilayer perceptrons, an ONNX parser, and integration with TensorFlow to ease deep learning tasks.
It also adds the ability to execute custom neural network layers in FP16 precision, and support for the Xavier SoC through the NVIDIA DRIVE AI platform. TensorRT speeds up deep learning tasks such as machine translation, speech and image processing, and recommender systems on GPUs; across these application areas it accelerates processing by 45x to 190x. All members of the NVIDIA Registered Developer Program can use TensorRT 4 for free. For detailed information about the new features and updates, visit the official developer page.

Read more:
NVIDIA open sources NVVL, library for machine learning training
Nvidia's Volta Tensor Core GPU hits performance milestones. But is it the best?
Nvidia Tesla V100 GPUs publicly available in beta on Google Compute Engine and Kubernetes Engine

A new geometric deep learning extension library for PyTorch releases!

Sunith Shetty
19 Jun 2018
2 min read
PyTorch Geometric is a new geometric deep learning extension library for PyTorch. With it, you can perform deep learning on graphs and other irregular structures using the methods and features the library offers, including an easy-to-use mini-batch loader and helpful transforms for complex operations. The library also provides a large number of datasets behind simple interfaces, and all of these features work on arbitrary graphs as well as on 3D meshes and point clouds. The following methods are currently implemented (refer to each method's research paper for details):

SplineConv: spline-based CNNs for irregularly structured and geometric input such as graphs or meshes.
GCNConv: a scalable approach to semi-supervised learning on graph-structured data.
ChebConv: a generalized CNN model with fast localized spectral filtering on graphs.
NNConv: a neural message passing algorithm for quantum chemistry.
GATConv: graph attention networks that operate on graph-structured data.
AGNNProp: attention-based graph neural networks for graph-based semi-supervised learning.
SAGEConv: representation learning on large graphs, achieving strong results in a variety of prediction tasks.
Graclus Pooling: weighted graph cuts without eigenvectors.
Voxel Grid Pooling is also implemented. To learn more about installation, data handling mechanisms, and the full list of implemented methods and datasets, refer to the documentation; for simple hands-on practice, see the examples/ directory. The library is currently in its first alpha release, and you can contribute to the project by raising an issue if you notice anything unexpected.

Read more:
Can a production ready Pytorch 1.0 give TensorFlow a tough time?
Is Facebook-backed PyTorch better than Google's TensorFlow?
Python, Tensorflow, Excel and more – Data professionals reveal their top tools
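Most of the operators listed above are variations on a message passing scheme: each node aggregates feature vectors from its neighbors and updates its own. A minimal mean-aggregation round in plain Python (a conceptual toy, not the PyTorch Geometric API):

```python
def message_passing_round(features, edges):
    """One round of mean-neighbor aggregation, the core idea behind
    operators like GCNConv and SAGEConv: each node's new feature is
    the average of its own feature and its neighbors' features."""
    neighbors = {v: [v] for v in features}   # include a self-loop
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return {
        v: sum(features[n] for n in ns) / len(ns)
        for v, ns in neighbors.items()
    }

# Path graph a - b - c with scalar features.
feats = {"a": 1.0, "b": 3.0, "c": 5.0}
updated = message_passing_round(feats, edges=[("a", "b"), ("b", "c")])
print(updated)   # a: (1+3)/2, b: (3+1+5)/3, c: (5+3)/2
```

Real operators replace the plain average with learned weight matrices (GCNConv), attention coefficients (GATConv), or learned message functions (NNConv), but the aggregate-and-update shape is the same.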

The most valuable skills for web developers to learn in 2018

Natasha Mathur
18 Jun 2018
7 min read
Machine learning is gradually transforming the development landscape. As the hottest technology in the software industry right now, it has everyone from professionals to beginners hopping on the bandwagon. Machine learning holds immense potential, paving the way for cutting-edge applications across different domains, which is why application developers have started incorporating it into their development process to make it more effective. A web or app developer who knows ML has a competitive edge over one who doesn't. In this year's Skill Up 2018 survey, we asked developers about the most valuable skill they would like to adopt, and the answer was machine learning.

Source: Packt Skill Up Survey 2018

But how does machine learning help with the web and app development process?

Impact of machine learning on web and app development

Self-driving cars, robots, face detectors: all have a common denominator, machine learning. In these well-known areas, we have seen ML models work wonders by separating the best from the worst of user-generated content to make the web a more valuable experience. But machine learning is everywhere. It has helped find and eradicate the web spam that used to damage the user experience, and Google's artificial neural network for email spam filtering now blocks almost 99% of spam from reaching our inboxes. Companies like Pinterest and Instagram use ML to surface ever more interesting and engaging content in their apps. Another example is the Uber app, which uses machine learning to create a seamless and reliable experience for customers: ML helps estimate arrival times and travel costs, and provides customers with real-time information about the driver's location.
Among other areas, Uber uses ML to run an efficient ride-sharing marketplace, identify suspicious or fraudulent accounts, suggest optimal pickup and dropoff points, and even facilitate UberEATS delivery. Machine learning has the potential to take development skills to the next level, so if you want to be a versatile developer, ML no longer has to be a skill you put on the back burner. That's not the whole story, either: there are plenty of other examples of companies using ML to build their products, and plenty of reasons and opportunities for web developers to dive into machine learning. Let's take a look at them one by one.

Machine learning for data mining

Organizations across the globe use data mining techniques to examine their large databases and discover new information. ML is well suited to data mining because it is effective at detecting new patterns in huge amounts of data, using pattern recognition techniques and computational learning for prediction. Web developers can leverage web mining, a subset of data mining that uncovers distinct usage patterns from web data to understand and better serve the needs of web-based applications. It helps developers discover useful data such as users' browsing history and where web users come from, and web structure mining can further help developers analyze the node and connection structure of a website to describe HTML or XML tag usage.

Comprehending customer behavior

Web and mobile apps use supervised machine learning algorithms to address issues faced by users, which in turn improves the entire customer service process. For instance, contact forms are prevalent on websites these days, eliminating the need for users to self-select an issue and fill out endless form fields to get in touch with a customer care executive.
All you have to do now is fill in the contact form and you'll hear back from the appropriate customer care center, which streamlines the customer service process. Another great example is chatbots. A chatbot helps a website or app better understand patterns in customer behavior: what do customers search for most, what are their buying tendencies, what problems are they facing? These questions can be answered by a chatbot built on machine learning algorithms, and as a developer you can build such innovative solutions to enhance the whole process.

Personalizing content

The number one example of machine learning helping developers personalize the content within their sites is Facebook. In fact, several social media applications heavily leverage machine learning to provide users with more personalized and relevant content. Facebook uses ML for automatic friend tagging suggestions, mutual friend analysis, a personalized news feed, and video recommendations tailored to each user's tastes, combining predictive analytics and statistical analysis to detect patterns in user data and keep the content engaging. Recently, Twitter also started using machine learning algorithms to value users' time with a deeply personalized feed custom-tailored to each user's likes. Machine learning has become a game-changer for social media websites, and developers should grab the opportunity with both hands.

Dealing with security threats

Machine learning techniques such as logistic regression can help developers find and evaluate malicious websites. Classification algorithms can likewise detect and predict phishing websites, based on factors such as security features, domain identity, and data encryption techniques.
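The logistic regression mentioned here reduces to a weighted sum of site features squashed through a sigmoid into a probability. A hand-weighted toy sketch (the feature names and weights below are illustrative assumptions, not a trained model):

```python
import math

def sigmoid(z):
    """Squash a score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative, hand-picked weights - a real model would learn these
# from labeled examples of phishing and legitimate sites.
WEIGHTS = {"has_https": -1.5, "suspicious_domain": 2.5, "has_ip_url": 2.0}
BIAS = -1.0

def phishing_probability(site):
    z = BIAS + sum(WEIGHTS[f] * site.get(f, 0) for f in WEIGHTS)
    return sigmoid(z)

legit = {"has_https": 1, "suspicious_domain": 0, "has_ip_url": 0}
shady = {"has_https": 0, "suspicious_domain": 1, "has_ip_url": 1}
print(f"legit: {phishing_probability(legit):.2f}")   # low probability
print(f"shady: {phishing_probability(shady):.2f}")   # high probability
```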
Prevalent applications that use ML in web and app development include Snapchat, Tinder, and Netflix. Snapchat, for instance, uses machine learning to recognize people's facial features, while Netflix uses linear regression, logistic regression, and other machine learning algorithms that track users' activity to provide personalized content for viewers. We expect machine learning to transform the development process and help web developers take a bigger leap in their careers. The Skill Up survey also revealed another skill developers are keen to learn in the next 12 months: Python.

Python: the go-to language for both machine learning and web development

Python is one of the top languages for both web development and machine learning. Its easy syntax and fast development time make it a good choice for developers, and it offers a vast number of ML libraries, such as scikit-learn, Keras, TensorFlow, and SciPy, along with a rich and vibrant machine learning community. The versatile features of Python have helped build some of the most robust and popular websites, including Instagram, Quora, and YouTube. Likewise, the powerful capabilities of machine learning have made our lives simpler, introducing us to virtual assistants like Siri and Cortana and to face detection technology, among others. Machine learning is an incredible breakthrough for businesses and consumers alike. So if you're interested in joining the upper echelons of the development world, be ready to expand your toolbelt: dive into machine learning and brace yourself for the opportunities that will find their way to you.

Read more:
Asking if good developers can be great entrepreneurs is like asking if moms can excel at work and motherhood
What are web developers favorite front-end tools?
Packt's Skill Up report reveals all
Developers think managers don't know enough about technology. And that's hurting business.

Google's translation tool is now offline - and more powerful than ever thanks to AI

Pravin Dhandre
13 Jun 2018
2 min read
Google today rolled out its fast translation package in offline mode, delivering accurate and natural machine translations to users without a live internet connection. The team at Google worked for nearly two years to bring its powerful neural machine translation (NMT) technology to the native Google Translate applications on smartphones. Using neural nets, the package provides instant, accurate, human-sounding translations for both Android and iOS users. Previously, the offline translation tool worked by breaking down sentences and translating each phrase individually. With AI-powered NMT, the app translates the whole sentence swiftly in one go. NMT uses millions of translated examples collected from sources including books, documents, articles, and search engine results, and uses this information to work out how a given sentence can be formulated naturally while remaining true to its intended context. The offline feature is also surprisingly compact: each language package is just 35 MB, so you can download one to your phone without using up all of your precious storage. Google says the package will roll out in 59 languages over the next couple of days, including European, Indian, and several other languages. At present, you can translate the following languages offline: Afrikaans, Albanian, Arabic, Belarusian, Bengali, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Macedonian, Malay, Maltese, Marathi, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese and Welsh.
To use offline translations in the Google Translate app, browse to the Offline Translation settings and tap the symbol next to a language name to download its package. To learn more, check out the official announcement on the Google Blog.

Read more:
FAE (Fast Adaptation Engine): iOlite's tool to write Smart Contracts using machine translation
How to auto-generate texts from Shakespeare writing using deep recurrent neural networks
Implement Named Entity Recognition (NER) using OpenNLP and Java

IBM unveils world’s fastest supercomputer with AI capabilities, Summit

Natasha Mathur
11 Jun 2018
3 min read
IBM and the Department of Energy's Oak Ridge National Laboratory have revealed Summit, the world's most powerful and smartest supercomputer. It can perform 200 quadrillion calculations per second, a speed of 200 petaflops, roughly equivalent to all 7.6 billion people on the planet doing 26 million calculations per second on a basic calculator. Summit was funded back in 2014 as part of a $325 million Department of Energy program called Coral, but it took several years to build. Summit delivers its speed through new processors, fast storage, fast internal communications, and a versatile design suited to artificial intelligence methods, which also makes it quite expensive. Let's look at what the Summit supercomputer offers.

Supercomputer and AI integration

Dave Turek, vice president of high-performance computing and cognitive systems at IBM, said that AI and high-performance computing are not separate domains; the two are deeply interconnected, which is why Summit will use AI methods for different purposes. Summit will mainly be used for AI development and machine learning. Beyond AI, Oak Ridge will use Summit for scientific research in subjects such as chemical formula design, studying the links between cancer and genes at large scale, fusion energy investigation, astrophysics research into the universe, and simulating Earth's changing climate.

A super big supercomputer

Source: Oak Ridge National Laboratory

Summit consists of 4,608 interconnected computer servers housed in huge refrigerator-sized cabinets. It takes up an eighth of an acre, which, to put it into perspective, is the size of two tennis courts. Summit's peak energy consumption is 15 megawatts, enough to power more than 7,000 homes. Each server has two IBM Power9 chips running at 3.1 GHz, each with 22 cores running in parallel, plus six Nvidia Tesla V100 GPUs.
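These headline numbers are easy to sanity-check with a couple of divisions:

```python
# Two quick checks on the article's figures.
total_flops = 200e15            # 200 petaflops = 2.0e17 calculations/second

# 1) "7.6 billion people each doing 26 million calculations per second":
per_person = total_flops / 7.6e9
print(f"{per_person / 1e6:.1f} million calculations per person per second")

# 2) Compute per server, given 4,608 servers:
per_server = total_flops / 4608
print(f"{per_server / 1e12:.1f} teraflops per server")
```

The first works out to about 26 million, matching the comparison above; the second comes to roughly 43 teraflops per server.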
Each server holds 1.6 terabytes of memory, and data can be written at 2.2 terabytes per second to a 250-petabyte storage system, about 1,000 times the storage capacity of a high-end laptop.

Measuring supercomputer performance

Supercomputer performance is measured with the Linpack benchmark for the TOP500 list, where China's Sunway TaihuLight holds the highest score at 93 petaflops. But Turek feels that measuring a machine's value by a single figure of merit is inaccurate; a machine should instead prove it can scale on real applications. Summit is IBM's step toward exascale: with it, IBM is convinced it can reach its goal of building a system capable of a quintillion calculations per second (five times Summit's speed). Alongside Summit, work is also underway on a less powerful computer, Sierra, and both are scheduled to go online sometime this year. This advances the U.S. arsenal of supercomputers in a competition where the top spots have lately been held by other countries; Summit is the United States' chance to retake the lead.

Read more:
PyCon US 2018 Highlights: Quantum computing, blockchains, and serverless rule!
Quantum A.I.: An intelligent mix of Quantum+A.I.
Q# 101: Getting to know the basics of Microsoft's new quantum computing language

TensorFlow 1.9.0-rc0 release announced

Pravin Dhandre
08 Jun 2018
2 min read
The TensorFlow community keeps rolling out updates. The first release candidate for the next minor version, 1.9.0, was unveiled today with a healthy list of features, improvements, and bug fixes. In the previous 1.8.0 release, the team focused on GPU memory support, running on multiple GPUs, and cloud performance; in this release, the emphasis is on Keras support, gradient estimators, and layer improvements.

Major features and improvements in TensorFlow 1.9.0-rc0:
Updated tf.keras to the Keras 2.1.6 API.
tfe.Network is deprecated; inherit from tf.keras.Model instead.
Added support for core feature columns and losses to gradient boosted trees estimators.
The distributions.Bijector API supports broadcasting for Bijectors, with new API changes.
Layered variable names have changed.

Bug fixes and other changes in TensorFlow 1.9.0-rc0:
The DatasetBase::DebugString() method is now const.
Added the tf.contrib.data.sample_from_datasets() API for randomly sampling from multiple datasets.
Fixes for eager execution and Accelerated Linear Algebra (XLA).
tf.keras.Model.save_weights now saves in TensorFlow format by default.
Fixes for the TensorFlow Debugger (tfdbg) CLI.
Added "constrained_optimization" to tensorflow/contrib.
tf.contrib.framework.zero_initializer supports ResourceVariable.
tf.contrib.data.make_csv_dataset() supports line breaks in quoted strings.

Miscellaneous changes:
Added GCS configuration ops.
Changed the MakeIterator signature to enable propagating error status.
KL divergence for two Dirichlet distributions.
More consistent GcsFileSystem behavior for reads past EOF.
Added a benchmark for tf.scan in graph and eager modes.
Added complex128 support to FFT, FFT2D, FFT3D, IFFT, IFFT2D, and IFFT3D.
Support for preventing tf.gradients() from backpropagating through integer tensors.
Indicator column support in boosted trees.
Conv3D, Conv3DBackpropInput, and Conv3DBackpropFilter now support arbitrary.
LinearOperator[1D,2D,3D]Circulant added to tensorflow.linalg.
Allows LinearOperator to broadcast.

For the complete list of bug fixes and improvements, read the release notes on TensorFlow's GitHub page, where you can also download the source code to try all the features of TensorFlow 1.9.0-rc0.

Read more:
Implementing feedforward networks with TensorFlow
How TFLearn makes building TensorFlow models easier
Distributed TensorFlow: Working with multiple GPUs and servers
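Among the additions listed above, tf.contrib.data.sample_from_datasets() is easy to picture: it interleaves elements from several input datasets according to sampling weights. Its behavior can be sketched in plain Python (a toy stand-in, not the TensorFlow implementation):

```python
import random

def sample_from_datasets(datasets, weights, n, seed=0):
    """Toy stand-in for tf.contrib.data.sample_from_datasets():
    on each step, pick one of the input datasets according to
    `weights` and yield its next element, until `n` elements have
    been produced or every dataset is exhausted."""
    rng = random.Random(seed)
    iters = [iter(d) for d in datasets]
    out = []
    while len(out) < n and iters:
        i = rng.choices(range(len(iters)), weights=weights, k=1)[0]
        try:
            out.append(next(iters[i]))
        except StopIteration:
            # Drop the exhausted dataset and its weight, keep sampling.
            del iters[i]
            weights = weights[:i] + weights[i + 1:]
    return out

# Roughly three quarters of the samples come from the first dataset.
mixed = sample_from_datasets([["a"] * 100, ["b"] * 100], [0.75, 0.25], 20)
print(mixed)
```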
Keras 2.2.0 releases!

Sunith Shetty
08 Jun 2018
3 min read
The Keras team has announced version 2.2.0, with notable features to let developers perform deep learning with ease. The release brings new API changes, a new input mode, bug fixes, and performance improvements to the high-level neural network API. Keras is a popular neural network API that can run on top of TensorFlow, CNTK, or Theano. The Python API is developed with a focus on fast experimentation, minimizing the delay between idea and result; it is a highly efficient library that enables easy and fast prototyping and runs seamlessly on CPU and GPU. Some of the noteworthy changes in Keras 2.2.0:

New areas of improvement
A new API for model definition: Model subclassing.
A new input mode: the ability to call models on TensorFlow tensors directly (TensorFlow backend only).
Improved feature coverage of Keras with the CNTK and Theano backends.
Lots of bug fixes and performance improvements to the Keras API.
The Keras engine now follows a much more modular structure, improving code structure, code health, and test time.
The Keras modules applications and preprocessing are now externalized to their own repositories, keras-applications and keras-preprocessing.

New API changes
Added the MobileNetV2 application, available for all backends.
Enabled CNTK and Theano support for the Xception and MobileNet applications, the SeparableConv1D and SeparableConv2D layers, and the backend methods separable_conv1d and separable_conv2d, all previously available only for TensorFlow.
You can now feed symbolic tensors to models with the TensorFlow backend.
Support for input masking in the TimeDistributed layer.
ReLU activation is easier to configure, while retaining easy serialization, via a new advanced_activation layer, ReLU.
For the complete list of new API changes, visit GitHub.

Breaking changes
The legacy Merge layers and their related functionality, remnants of Keras 0, have been removed. These layers were deprecated in May 2016, with full removal scheduled for August 2017; models from the Keras 0 API that use them can no longer be loaded in Keras 2.2.0 and above.
The truncated_normal base initializer now returns values scaled by ~0.9, giving the correct variance after truncation.

For the full list of updates, refer to the release notes.

Read more:
Why you should use Keras for deep learning
Implementing Deep Learning with Keras
2 ways to customize your deep learning models with Keras
How to build Deep convolutional GAN using TensorFlow and Keras
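The ~0.9 factor in the truncated_normal change comes from the geometry of truncation: a standard normal restricted to ±2 standard deviations has a standard deviation of about 0.88, not 1. A quick Monte Carlo check in plain Python (this only verifies the factor; it is not the Keras implementation):

```python
import math
import random

def truncated_normal(rng, limit=2.0):
    """Sample a standard normal, rejecting anything beyond ±limit,
    as a truncated-normal initializer does."""
    while True:
        x = rng.gauss(0.0, 1.0)
        if abs(x) <= limit:
            return x

rng = random.Random(42)
samples = [truncated_normal(rng) for _ in range(200_000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
print(f"standard deviation after truncation at 2 sigma: {std:.3f}")
```

The measured value comes out near 0.88, which is the ~0.9 scale factor the release note is compensating for.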

Project Hydrogen: Making Apache Spark play nice with other distributed machine learning frameworks

Sunith Shetty
06 Jun 2018
5 min read
The Apache Spark team revealed a new venture, Project Hydrogen, during a keynote at the Spark + AI Summit. The project focuses on eliminating the obstacles organizations face when using Spark with deep learning frameworks such as TensorFlow and MXNet. The rise of Apache Spark is evident from the fact that it is one of the most widely adopted platforms for big data processing, outperforming other big data frameworks like Hadoop, and it has shown significant growth in the big data field. Thanks to its excellent functionality and services, Spark is one of the most used unified big data frameworks for data processing, SQL querying, real-time streaming analytics, and machine learning. If you want to understand why Apache Spark is gaining popularity, check out our interview with Romeo Kienzler, Chief Data Scientist in the IBM Watson IoT worldwide team.

What are the current limitations of Apache Spark?

Apache Spark works fine on its own in the big data field, but the power of its single framework breaks down when you try to use third-party distributed machine learning or deep learning frameworks with it. Spark has its own machine learning library, Spark MLlib, which provides noteworthy machine learning functionality. But given the pace of development and research in machine learning and artificial intelligence, data scientists and practitioners want to explore the power of leading deep learning frameworks such as TensorFlow, Keras, MXNet, Caffe2, and more. The problem is that Apache Spark and deep learning frameworks don't play well together. As requirements grow and tasks become more advanced, Spark users want to combine Spark with these frameworks to handle complex functionality, but the way the Spark scheduler works is incompatible with how the machine learning frameworks work.

Do we have any in-house solutions?
There are basically two options for combining Spark with other deep learning frameworks.

Option 1: use two separate clusters.

Source: Databricks - Spark AI Summit 2018

As the preceding image shows, we have two clusters. All the data processing work, including data prep and data cleansing, is performed in the Spark cluster, and the final result is written to a storage repository (HDFS or S3). A second cluster running the distributed machine learning framework reads the data from the repository. This architecture is no longer unified: one of the core challenges is handling two disparate systems, since you need to understand how each of them works. Each cluster may have its own debugging schemes and log files, making the whole setup very difficult to operate.

Option 2: run both on one cluster.

Some users have tried to tackle the operational difficulties, debugging, and testing challenges of option 1 by running Spark and the distributed machine learning framework on a single cluster, as the following image shows. The result, however, is not convincing, because of the inconsistency in how the two systems work. There is a great difference between how Spark tasks are scheduled and how deep learning tasks are scheduled. In Spark, each job is divided into a number of subtasks that are independent of each other. Deep learning frameworks use different scheduling schemes: depending on the job, they use MPI or their own custom RPCs for communication, and they assume complete coordination and dependency among their set of tasks.

Source: Databricks - Spark AI Summit 2018

You can see the consequences of this mismatch clearly when tasks fail.
For example, as shown in the following figure, in the Spark model, when a task fails the Spark scheduler simply restarts that single task, and the entire job recovers. In deep learning frameworks, because of the complete dependency among tasks, if any task fails, all the tasks need to be launched again.

Source: Databricks - Spark AI Summit 2018

The Solution: Project Hydrogen

Project Hydrogen aims to solve the challenges of using Spark and other deep learning frameworks together. It is positioned as a solution that lets data scientists plug other deep learning frameworks into Spark. The project introduces a new scheduling primitive, gang scheduling, which addresses the dependency problem that the deep learning schedulers introduce in option 2.

Source: Databricks - Spark AI Summit 2018

Gang scheduling is all-or-nothing: either all of a job's tasks are scheduled in one go, or none of them are scheduled at all. This bridges the disparity between how the two systems work.

What's next?

The Project Hydrogen APIs are not ready yet; we can expect them to be added to the core Apache Spark project later this year. The primary goal of the project is to embrace all distributed machine learning frameworks in the Spark ecosystem, allowing every framework to run as smoothly as Spark's own MLlib. Alongside scheduling, the project is also working on speeding up data exchange, which often becomes a bottleneck in machine learning and deep learning jobs, and on accelerator awareness so that GPUs and FPGAs can be used comfortably in Spark clusters.

Apache Spark 2.3 now has native Kubernetes support!
How to win Kaggle competition with Apache SparkML
How to build a cold-start friendly content-based recommender using Apache Spark SQL
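The all-or-nothing behavior of gang scheduling can be illustrated with a toy, stdlib-only simulation. This is an illustrative sketch of the scheduling idea, not the actual Spark scheduler API; the function names are hypothetical:

```python
# Toy illustration of gang scheduling: a job's tasks are admitted
# all at once or not at all, unlike Spark's independent task scheduling.

def gang_schedule(tasks, free_slots):
    """Admit all tasks if enough slots exist, otherwise admit none."""
    if len(tasks) <= free_slots:
        return list(tasks), free_slots - len(tasks)  # all scheduled
    return [], free_slots                            # nothing scheduled

def independent_schedule(tasks, free_slots):
    """Spark-style: schedule as many independent tasks as slots allow."""
    admitted = list(tasks)[:free_slots]
    return admitted, free_slots - len(admitted)

job = ["t0", "t1", "t2", "t3"]

# With only 3 slots, gang scheduling refuses to start a 4-task job...
gang_admitted, _ = gang_schedule(job, free_slots=3)
# ...while independent scheduling happily runs a partial job, which
# would deadlock MPI-style frameworks that expect all peers to be up.
indep_admitted, _ = independent_schedule(job, free_slots=3)

print(len(gang_admitted))   # 0
print(len(indep_admitted))  # 3
```

The point of the sketch is the failure mode it avoids: a partially scheduled deep learning job holds resources while waiting forever for peers that were never launched, whereas all-or-nothing admission either makes progress or leaves the cluster free.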

Databricks open sources MLflow, simplifying end-to-end Machine Learning Lifecycle

Pravin Dhandre
06 Jun 2018
2 min read
Machine learning has energized software applications with highly accurate predictions, driving up demand for the products of tech-driven companies. However, while developing such smart applications, data scientists and machine learning professionals face numerous machine learning challenges and software development issues. Today, Databricks open sourced its newly developed framework, MLflow, which aims to simplify complex machine learning experiments through automation and to make machine learning models deployable across any platform.

With MLflow, users can standardize the complex processes involved in building and deploying machine learning and predictive models. The framework gives data scientists tools to track experiments, package their machine learning code, and manage their models on any of the popular machine learning frameworks. The current platform offers the following three components:

MLflow Tracking: logs code versions, data files, configurations, and results. It also lets you query your experiments, so you can visualize and compare experiments and parameters swiftly and without hassle.

MLflow Projects: provides a structured format for packaging machine learning code, along with API and CLI tools. This allows data scientists to reuse and reproduce their code and to chain projects and workflows together.

MLflow Models: a standard format for packaging and distributing machine learning models across different downstream tools; examples include Azure ML-compatible models, deploying with Amazon SageMaker, or serving through a local REST API.

The current version is an alpha release, and more features will be added before the full release.
To get more details on its core offerings, APIs, and command-line interfaces, read the official documentation at mlflow.org.

MachineLabs, the browser based machine learning platform, goes open source
Microsoft Open Sources ML.NET, a cross-platform machine learning framework
Google announces Cloud TPUs on the Cloud Machine Learning Engine (ML Engine)

Apache Flink 1.5.0 is out

Pravin Dhandre
01 Jun 2018
2 min read
After almost five months of hard work by the Flink community, the team is happy to roll out its newest release, Apache Flink 1.5.0. This is a major release in the 1.x series, featuring advanced capabilities along with more than 750 fixed bugs and issues.

Apache Flink is an open-source big data processing framework used for real-time analytics, stream processing, and batch processing applications. The framework delivers fast, efficient, accurate, and highly fault-tolerant handling of massive streams of events. With more than 330 active contributors, Apache Flink is one of the most active stream processing projects at the Apache Software Foundation.

Key new features and improvements:

Rewritten deployment and process model
Dynamic allocation and release of resources on YARN and Mesos.
Simplified deployment on Kubernetes.
Job submission, cancellation, and status requests to the JobManager now go through REST.

Broadcast state
Connects a broadcast stream (such as context data or machine learning models) with other streams.
Broadcast state can be checkpointed and restored.
Unblocks the implementation of the "dynamic patterns" feature.

Improvements to Flink's network stack
Credit-based flow control for high throughput.
Improved performance by lowering latencies without reducing throughput.

Task-local state recovery
Keeps a copy of the application state on the local disk of each machine.
Improves failure recovery.

Extended join support for SQL and Table API
Joins of tables on bounded time ranges in both event time and processing time.
Full-history matching, as in standard SQL joins.

SQL CLI client
A new SQL CLI client for running exploratory queries on data streams.
Serves both streaming and batch SQL queries.
Various other features and improvements:
Support for OpenStack's S3-like file system.
Improved reading and writing of JSON messages from and to connectors.
Application rescaling without manual triggers.
Improved watermark and latency metrics.

For the complete list of features and improvements, please review the release notes on the official Apache Flink page.

Flink Complex Event Processing
Top 5 programming languages for crunching Big Data effectively
Working with Kafka Streams

Anaconda 5.2 releases!

Sunith Shetty
01 Jun 2018
2 min read
The Anaconda team has announced a new release, Anaconda Distribution 5.2. This version brings several changes across platform updates, user-facing improvements, and backend improvements.

Anaconda is a free, open-source distribution of Python that offers a fast, easy, and powerful way to do data science and machine learning. It is an efficient platform for large-scale data processing, scientific computing, and more. With over 6 million users, it includes more than 250 data science packages for all major operating systems: Windows, Linux, and macOS. Package versions are managed by the conda package manager.

Some of the noteworthy changes in Anaconda Distribution 5.2:

Major highlights
More than 100 packages have been updated or added (notable updates include Qt 5.9.5, OpenSSL 1.0.2o, NumPy 1.14.3, SciPy 1.1.0, Matplotlib 2.2.2, and pandas 0.23.0).
Windows installers now control their environment more carefully, so even if menu shortcuts fail to be created, installation won't run into serious issues.
The developer certificate for macOS .pkg installers has been updated to Anaconda, Inc.

User-facing improvements
All default channels now point to repo.anaconda.com instead of repo.continuum.io.
More dynamic shortcut working-directory behavior improves Windows multi-user installations.
To prevent usability issues, Windows installers now disallow the characters ! % ^ = in the installation path.
Backend improvements
Security fixes for more than 20 packages based on in-depth Common Vulnerabilities and Exposures (CVE) analysis.
Improved behavior of --prune, since the history file is now updated correctly in the conda-meta directory.
The Windows installer now uses a trimmed-down value for the PATH environment variable to avoid DLL-hell problems with existing software.

In addition, several changes target all x86 platforms, Linux distributions, and Windows distributions. For the complete list, refer to the release notes. To download Anaconda Distribution 5.2, get the installer from the official page; alternatively, update an existing installation to version 5.2 with conda update conda followed by conda install anaconda=5.2.

30 common data science terms explained
Data science on Windows is a big no
10 Machine Learning Tools to watch in 2018

Intel AI Lab introduces NLP Architect Library

Sunith Shetty
30 May 2018
3 min read
Data is an integral part of every business and organization, used to make valuable decisions in changing circumstances. Natural Language Processing (NLP) is a widely adopted set of techniques that machines use to understand and communicate with humans in human language, enabling people to access, analyze, and extract information more intelligently from huge amounts of unstructured data.

Intel AI Lab's team of NLP researchers and developers has introduced NLP Architect, a new open-source Python library. The library is a platform for future research and for developing state-of-the-art deep learning techniques for natural language processing and natural language understanding. Rapid advances in deep learning and neural network paradigms have driven growth in the NLP domain, and this library offers flexibility in implementing NLP solutions that draw on Intel AI Lab's past and ongoing NLP research and development.

NLP Architect overview

The current version of NLP Architect offers noteworthy features that form the backbone of both research and practical development. All of the following models ship with the required training and inference processes:

Core NLP models such as the BIST parser and NP chunker, which allow powerful extraction of linguistic features for NLP workflows
NLU models such as intent extraction (IE) and named entity recognition (NER), used in intent-based applications
Modules that address semantic understanding
Components that are key to conversational AI, such as chatbot and dialog applications
End-to-end deep learning applications such as Q&A and reading comprehension

Source: Intel AI blog

This library of NLP components provides the functionality needed to extend NLP solutions for a range of audiences, and it serves as a medium for analyzing and optimizing Intel software and hardware on NLP workloads.
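To make the NP chunker mentioned above concrete: a noun-phrase chunker groups consecutive tokens into noun phrases. The following is a toy, rule-based sketch of the idea using only the standard library; it is not the NLP Architect API, whose chunker is a trained model rather than a set of hand-written rules:

```python
# Toy noun-phrase (NP) chunker: group determiner/adjective/noun runs
# from pre-tagged (token, part-of-speech) pairs. Illustrative only --
# real NP chunkers, like NLP Architect's, are learned models.

NP_TAGS = {"DET", "ADJ", "NOUN"}

def np_chunks(tagged_tokens):
    chunks, current = [], []
    for token, tag in tagged_tokens:
        if tag in NP_TAGS:
            current.append(token)
        elif current:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

sentence = [("the", "DET"), ("quick", "ADJ"), ("fox", "NOUN"),
            ("jumps", "VERB"), ("over", "ADP"),
            ("the", "DET"), ("lazy", "ADJ"), ("dog", "NOUN")]

print(np_chunks(sentence))  # ['the quick fox', 'the lazy dog']
```

Extracted chunks like these are the "linguistic features" that downstream NLP components, such as intent extraction, build on.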
In addition to these models, the library adds features such as data pipelines, common functional calls, and NLP-related utilities that are frequently used when deploying models. To learn more about the updates, refer to the official Intel AI blog.

How NLP Architect can be used
Train models using the provided datasets, configurations, and algorithms
Train models on your own data
Create new models or extend existing ones
Explore common (and not-so-common) NLP challenges using deep learning models
Optimize and extend state-of-the-art deep learning algorithms
Integrate modules and utilities from the library into NLP solutions

Deep learning framework support

The repository supports several open-source deep learning frameworks:
Intel Nervana Graph
Intel Neon
Intel-optimized TensorFlow
Dynet
Keras

Note: the list of supported frameworks is expected to grow, and all models run on Python 3.5+.

To download the open-source Python library, or to contribute to the project and provide feedback, get the code from GitHub. Complete documentation for all core modules, with end-to-end examples, can be found on the official page.

Intel takes Facebook's help on AI chip; Cisco uses AI to predict IT services; and more
Introducing Intel's OpenVINO computer vision toolkit for edge computing
Facelifting NLP with Deep Learning

MariaDB 10.3.7 releases

Pravin Dhandre
28 May 2018
2 min read
Last Friday, the MariaDB Foundation officially announced the general availability of MariaDB 10.3.7, a new stable version and a substantial release within the 10.3 series. MariaDB is fast, scalable, and robust, with a rich ecosystem of storage engines and plugins for a wide variety of use cases across banking, social sites, e-commerce, and more.

Improvement highlights
The MyRocks storage engine 1.0, with its high compression ratio, is now stable in MariaDB 10.3.7.
The Spider storage engine 3.3.13 is now stable, supporting partitioning and XA transactions.
Added two new ALTER TABLE algorithm options, INSTANT and NOCOPY, for instant metadata-only changes and for operations that avoid rebuilding the clustered index, respectively.
SSL support in the embedded server library when connecting to remote servers.
New status variables, feature_json and feature_system_versioning, for monitoring the use of JSON functionality and system versioning respectively.
The InnoDB version number (5.7) is removed from MariaDB 10.3 onwards.
Bug fixes for ADD COLUMN.
Improved ALTER TABLE algorithms, including ALGORITHM=INSTANT and ALGORITHM=NOCOPY.
Various performance fixes and code cleanup, including cleaned-up InnoDB parameter validation, a fix for a hang while shutting down InnoDB, and performance improvements so FLUSH TABLES…FOR EXPORT no longer hangs.
Future releases drop support for Debian 7 "Wheezy" and Fedora 26; users need to move to Debian 8 "Jessie" or Fedora 27 and onwards.

With these added features and performance improvements, MariaDB developers are better equipped to turn their data into well-structured information. Please refer to the release notes and changelog for more details.

MySQL 8.0 is generally available with added features
Why Oracle is losing the Database Race
Neo4j 3.4 aims to make connected data even more accessible

PostgreSQL 11 Beta 1 is out!

Sunith Shetty
25 May 2018
4 min read
The PostgreSQL team has announced the first beta of PostgreSQL 11, offering a sneak peek at the features that will ship in the final release, which is likely to arrive in late 2018. The major features center on database simplicity, handling large datasets, and addressing various performance bottlenecks. Some minor changes can be expected before the final release, and since this is a beta, it is strongly advised not to run it in production.

PostgreSQL is an open-source relational database management system that has grown in popularity over the years. With more than 30 years of continuous development, it is one of the most popular databases in use today, and it was named DBMS of 2017 for its reliability, robustness, and performance.

Some of the noteworthy changes in PostgreSQL 11 Beta 1:

Partitioning improvements

Partitioning plays an integral part in splitting a large dataset into smaller pieces so that complex operations can be carried out with ease. PostgreSQL 11 contains several new features and improvements for working with partitioned data:

Hash partitioning, a new feature, lets you partition using a hash key.
UPDATE statements on a partition key now move the affected rows to the appropriate partitions.
Enhanced partition elimination during query processing and execution improves SELECT query performance.
Full support for PRIMARY KEY, FOREIGN KEY, triggers, and indexes on partitions.
A new feature allows queries to push grouping and aggregation down to partitioned tables before the final aggregation. To enable it, set enable_partitionwise_aggregate = on in your configuration file; it is disabled by default.
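The routing logic behind hash partitioning can be sketched in a few lines. This is a stdlib-only illustration of the concept, not PostgreSQL's internal hash functions:

```python
# Toy illustration of hash partitioning: each row is routed to a
# partition by hashing its partition key modulo the partition count.
import zlib

NUM_PARTITIONS = 4

def partition_for(key):
    # Use a stable hash (unlike Python's randomized hash()) so routing
    # is deterministic across runs, as a database requires.
    return zlib.crc32(str(key).encode()) % NUM_PARTITIONS

rows = [("alice", 1), ("bob", 2), ("carol", 3), ("dave", 4)]
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for name, user_id in rows:
    partitions[partition_for(user_id)].append(name)

# Every row lands in exactly one partition, and the same key always
# routes to the same partition -- which is what makes partition
# elimination possible when a query filters on the key.
print({p: names for p, names in partitions.items() if names})
```

Because the key-to-partition mapping is deterministic, a query with an equality filter on the key only needs to scan one partition, which is the basis of the improved partition elimination described above.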
Parallelism improvements

New features build out the parallel query infrastructure so workloads are managed and executed more efficiently, providing significant performance gains:

Hash joins and CREATE INDEX for B-tree indexes are now parallelized.
Certain queries with UNION can now run in parallel.

SQL stored procedures

PostgreSQL 11 introduces SQL stored procedures, which allow embedded transactions (BEGIN, COMMIT/ROLLBACK, and more) within a procedure.

Just-in-time compilation

JIT compilation optimizes the execution of code and operations at run time, and it provides a framework for further optimizations in the future. If you are building PostgreSQL 11 from source, you can enable JIT compilation using the --with-llvm flag.

Window functions

In PostgreSQL 11, window functions support all options in the SQL:2011 standard.

SCRAM authentication

PostgreSQL 11 adds channel binding for SCRAM authentication, a security feature that helps prevent man-in-the-middle attacks. SCRAM authentication itself was already available, improving how passwords are stored and transmitted on the basis of a standard protocol.

Simplicity and user experience improvements

Although PostgreSQL provides a healthy set of features, not all of them are easy to use in development and production environments, so the team has made several user-experience improvements. For example, you can now quit the PostgreSQL command line (psql) with the keywords quit and exit.

Additional improvements and features

Many other improvements and features have been added to PostgreSQL 11; refer to the release notes for the complete list.
If you want to contribute to the project by testing this new release to find bugs and issues, download PostgreSQL 11 Beta 1 from the official page. Existing open issues are tracked in the PostgreSQL wiki, and bugs can be reported through the bug report form on the PostgreSQL website.

How to perform data partitioning in PostgreSQL 10
New updates to Microsoft Azure services for SQL Server, MySQL, and PostgreSQL
2018 is the year of graph databases. Here's why

Amazon is selling facial recognition technology to police

Richard Gall
23 May 2018
4 min read
The American Civil Liberties Union (ACLU) has revealed that Amazon has been selling its facial recognition software, called Rekognition, to a number of law enforcement agencies in the U.S. Using freedom of information requests, the ACLU obtained correspondence between the respective departments and Amazon.

According to the ACLU, Rekognition is a dangerous step towards a surveillance state that could lead to serious infringements on civil liberties. Here's what the ACLU had to say in a post published on Tuesday 22 May:

People should be free to walk down the street without being watched by the government. By automating mass surveillance, facial recognition systems like Rekognition threaten this freedom, posing a particular threat to communities already unjustly targeted in the current political climate. Once powerful surveillance systems like these are built and deployed, the harm will be extremely difficult to undo.

How is Rekognition currently being used?

Two U.S. police departments are using Rekognition. In Oregon, the Washington County Sheriff's Office is using the facial recognition tool to identify persons of interest against a database of 300,000 mugshots, a project that has been underway for some time: Chris Adzima, Senior Information Systems Analyst for the Washington County Sheriff's Office, wrote a guest post on the AWS website in June 2017 outlining how they were using Rekognition. Once the architecture was in place, the team built a mobile app to make the technology usable for officers.

In Orlando, meanwhile, police have been using AWS for 'consulting and advisory services' and are seeking to implement Rekognition in a project referred to in the documentation as 'Orlando Safety Video POC'. Orlando City police are paying $39,000 for AWS' time on the project.
Civil liberties organizations pen an open letter to Jeff Bezos

The ACLU, along with a number of other organizations including the Electronic Frontier Foundation and Data for Black Lives, penned an open letter to Jeff Bezos to express their concern. Appealing to Amazon's past commitment to civil liberties, the letter stated:

In the past, Amazon has opposed secret government surveillance. And you have personally supported First Amendment freedoms and spoken out against the discriminatory Muslim Ban. But Amazon's Rekognition product runs counter to these values. As advertised, Rekognition is a powerful surveillance system readily available to violate rights and target communities of color.

The letter is an impassioned plea for Amazon to consider the ways in which it is complicit with government agencies, and it offers a serious warning about the potential consequences of facial recognition technology in the hands of law enforcement.

Amazon defends collaborating with police

Amazon has been quick to defend itself. In a statement emailed to various news organizations, the company said: "Our quality of life would be much worse today if we outlawed new technology because some people could choose to abuse the technology. Imagine if customers couldn't buy a computer because it was possible to use that computer for illegal purposes? Like any of our AWS services, we require our customers to comply with the law and be responsible when using Amazon Rekognition."

The key issue with Amazon's statement is that the analogy with personal computers doesn't quite hold. Individuals aren't responsible for maintaining the law, and neither do they hold the same power that law enforcement agencies do. Technology might change how individuals behave, but that behavior must still comply with the law. The current scenario is a little different; the concern is around how technology might actually change the way the law functions.
There isn't, strictly speaking at least, any way of governing how that happens. Whatever you make of Amazon's work with law enforcement, it's clear that we are about to enter a new era of disruption and innovation in public institutions. For some people, collaboration between public and private realms opens up plenty of opportunities. But there are many dangers that must be monitored and challenged.

Read next:
Top 10 Tools for Computer Vision
Admiring the many faces of Facial Recognition with Deep Learning