
Tech News - Data

1208 Articles

Google open-sources GPipe, a pipeline parallelism Library to scale up Deep Neural Network training

Natasha Mathur
05 Mar 2019
3 min read
The Google AI research team announced yesterday that it is open sourcing GPipe, a distributed machine learning library for efficiently training large-scale deep neural network models, under the Lingvo framework. GPipe uses synchronous stochastic gradient descent and pipeline parallelism for training. It divides the network layers across accelerators and pipelines execution to achieve high hardware utilization. GPipe also allows researchers to easily deploy more accelerators to train larger models and to scale performance without tuning hyperparameters.

Google AI researchers had also published a paper titled “GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism" in December last year. In the paper, they demonstrated the use of pipeline parallelism to scale up deep neural networks and overcome the memory limitation of current accelerators. Let’s have a look at the major highlights of GPipe.

GPipe helps with maximizing memory and efficiency

GPipe helps maximize the memory allocation for model parameters. Researchers conducted experiments on Cloud TPUv2s, each of which consists of 8 accelerator cores and 64 GB of memory (8 GB per accelerator). Generally, without GPipe, a single accelerator can train up to 82 million model parameters because of memory limitations; GPipe, however, was able to bring down the intermediate activation memory from 6.26 GB to 3.46 GB on a single accelerator.

Researchers also measured the effect of GPipe on the model throughput of AmoebaNet-D to test its efficiency, and found an almost linear speedup in training. GPipe also enabled 8-billion-parameter Transformer language models on 1024-token sentences with a speedup of 11x.

[Figure: Speedup of AmoebaNet-D using GPipe]

Putting the accuracy of GPipe to the test

Researchers used GPipe to verify the hypothesis that scaling up existing neural networks can help achieve better model quality. For this experiment, an AmoebaNet-B with 557 million model parameters and an input image size of 480 x 480 was trained on the ImageNet ILSVRC-2012 dataset. The model reached 84.3% top-1 / 97% top-5 single-crop validation accuracy without the use of any external data. Researchers also ran transfer learning experiments on the CIFAR-10 and CIFAR-100 datasets, where the giant models improved the best published CIFAR-10 accuracy to 99% and CIFAR-100 accuracy to 91.3%.

“We are happy to provide GPipe to the broader research community and hope it is a useful infrastructure for efficient training of large-scale DNNs”, say the researchers.

For more information, check out the official GPipe blog post.

Google researchers propose building service robots with reinforcement learning to help people with mobility impairment
Google AI researchers introduce PlaNet, an AI agent that can learn about the world using only images
Researchers release unCaptcha2, a tool that uses Google’s speech-to-text API to bypass the reCAPTCHA audio challenge
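To make the micro-batch pipelining idea described above more concrete, here is a minimal, illustrative Python sketch. It is not the GPipe/Lingvo API: the two toy "stages", their sizes, and the micro-batch count are assumptions chosen for illustration, and it assumes PyTorch is available.

```python
# Illustrative sketch of GPipe-style micro-batch pipelining (not the GPipe API).
import torch
import torch.nn as nn

# Two "stages" standing in for groups of layers placed on separate accelerators.
stage1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
stage2 = nn.Sequential(nn.Linear(64, 10))

def pipelined_forward(batch, n_micro=4):
    """Split a mini-batch into micro-batches and stream them through the stages.

    With real accelerators, stage2 would process micro-batch k while stage1 is
    already working on micro-batch k+1; here the overlap is only conceptual.
    """
    outputs = []
    for micro in batch.chunk(n_micro):       # micro-batches, as in GPipe
        hidden = stage1(micro)               # stage 1 (accelerator 0)
        outputs.append(stage2(hidden))       # stage 2 (accelerator 1)
    # Gradients are accumulated over the whole mini-batch, i.e. synchronous SGD.
    return torch.cat(outputs)

x = torch.randn(64, 32)
print(pipelined_forward(x).shape)            # torch.Size([64, 10])
```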

Microsoft announces the first public preview of SQL Server 2019 at Ignite 2018

Amey Varangaonkar
25 Sep 2018
2 min read
Microsoft made several key announcements at its Ignite 2018 event, which began yesterday in Orlando, Florida. The biggest announcement of them all was the public preview availability of SQL Server 2019. With this new release of SQL Server, businesses will be able to manage their relational and non-relational data workloads in a single database management system.

What we can expect in SQL Server 2019

- Microsoft SQL Server 2019 will run either on-premises or on the Microsoft Azure stack
- Microsoft announced the Azure SQL Database Managed Instance, which will allow businesses to port their database to the cloud without any code changes
- Microsoft announced new database connectors that will allow organizations to integrate SQL Server with other databases such as Oracle, Cosmos DB, MongoDB and Teradata
- SQL Server 2019 will get built-in support for popular open source Big Data processing frameworks such as Apache Spark and Apache Hadoop
- SQL Server 2019 will have smart machine learning capabilities with support for SQL Server Machine Learning Services and Spark Machine Learning
- Microsoft also announced support for Big Data clusters managed through Kubernetes - the Google-incubated container orchestration system

With organizations slowly moving their operations to the cloud, Microsoft seems to have hit the jackpot with the integration of SQL Server and Azure services. Microsoft has claimed businesses can save up to 80% of their operational costs by moving their SQL database to Azure. Also, given the rising importance of handling Big Data workloads efficiently, SQL Server 2019 will now be able to ingest, process and analyze Big Data on its own with built-in capabilities of Apache Spark and Hadoop - the world’s leading Big Data processing frameworks.

Although Microsoft hasn’t hinted at the official release date yet, it is expected that SQL Server 2019 will be generally available in the next 3-5 months. Of course, the timeline can be extended or accelerated depending on the feedback received from the tool’s early adopters. You can try the public preview of SQL Server 2019 by downloading it from the official Microsoft website.

Read more
Microsoft announces the release of SSMS, SQL Server Management Studio 17.6
New updates to Microsoft Azure services for SQL Server, MySQL, and PostgreSQL
Troubleshooting in SQL Server

Red Hat drops MongoDB over concerns related to its Server Side Public License (SSPL)

Natasha Mathur
17 Jan 2019
3 min read
It was last October when MongoDB announced that it was switching to the Server Side Public License (SSPL). Now, the news of Red Hat removing MongoDB from Red Hat Enterprise Linux and Fedora over its SSPL license has been gaining attention.

Tom Callaway, University Outreach team lead at Red Hat, mentioned in a note earlier this week that Fedora does not consider MongoDB’s Server Side Public License v1 (SSPL) a free software license. He further explained that the SSPL is “intentionally crafted to be aggressively discriminatory towards a specific class of users. To consider the SSPL to be "Free" or "Open Source" causes that shadow to be cast across all other licenses in the FOSS ecosystem, even though none of them carry that risk”.

The first instance of Red Hat removing MongoDB happened back in November 2018 when its RHEL 8.0 beta was released. The RHEL 8.0 beta release notes explicitly mentioned that the reason behind the removal of MongoDB was its SSPL license.

Apart from Red Hat, Debian also dropped MongoDB from its archive last month due to similar concerns over the SSPL. “For clarity, we will not consider any other version of the SSPL beyond version one. The SSPL is clearly not in the spirit of the DFSG (Debian’s free software guidelines), let alone complimentary to the Debian's goals of promoting software or user freedom”, mentioned Chris Lamb, Debian Project Leader.

Also, Debian developer Apollon Oikonomopoulos mentioned that MongoDB 3.6 and 4.0 will be supported longer, but that Debian will not be distributing any SSPL-licensed software. He also mentioned how keeping the last AGPL-licensed version (3.6.8 or 4.0.3) without the ability to “cherry-pick upstream fixes is not a viable option”. That being said, MongoDB 3.4 will remain a part of Debian as long as it is AGPL-licensed (MongoDB’s previous license).

MongoDB’s decision to move to the SSPL was driven by cloud providers exploiting its open source code. The SSPL specifies an explicit condition that companies wanting to use, review, modify or redistribute MongoDB as a service would have to open source the software that they’re using. This, in turn, led to a debate in the industry and the open source community, which started to question whether MongoDB is open source anymore.

https://twitter.com/mjasay/status/1082428001558482944

Also, MongoDB’s adoption of the SSPL forces companies to either go open source or choose MongoDB’s commercial products. “It seems clear that the intent of the license author is to cause Fear, Uncertainty, and Doubt towards commercial users of software under that license”, mentioned Callaway.

https://twitter.com/mjasay/status/1083853227286683649

MongoDB acquires mLab to transform the global cloud database market and scale MongoDB Atlas
MongoDB Sharding: Sharding clusters and choosing the right shard key [Tutorial]
MongoDB 4.0 now generally available with support for multi-platform, mobile, ACID transactions and more

Introducing PostgREST, a REST API for any PostgreSQL database written in Haskell

Bhagyashree R
04 Nov 2019
3 min read
Written in Haskell, PostgREST is a standalone web server that enables you to turn your existing PostgreSQL database into a RESTful API. It offers you a much “cleaner, more standards-compliant, faster API than you are likely to write from scratch.” The PostgREST documentation describes it as an “alternative to manual CRUD programming.” Explaining the motivation behind the tool, the documentation reads, “Writing business logic often duplicates, ignores or hobbles database structure. Object-relational mapping is a leaky abstraction leading to slow imperative code. The PostgREST philosophy establishes a single declarative source of truth: the data itself.”

Performant by design

In terms of performance, PostgREST shows subsecond response times for up to 2000 requests/sec on the Heroku free tier. The main contributor to this impressive performance is its Haskell implementation using the Warp HTTP server. To maintain fast response times, it delegates most of the calculation to the database, including serializing JSON responses directly in SQL, data validation, and more. Along with that, it uses the Hasql library to make efficient use of the database.

A single declarative source of truth for security

PostgREST handles authentication via JSON Web Tokens. You can also build other forms of authentication on top of the JWT primitive. It delegates authorization to the role information defined in the database to ensure there is a single declarative source of truth for security.

Data integrity

PostgREST does not rely on an Object-Relational Mapper (ORM) or custom imperative coding. Instead, developers put declarative constraints directly into their database, preventing any kind of data corruption.

In a Hacker News discussion, many users praised the tool. “I think PostgREST is the first big tool written in Haskell that I’ve used in production. From my experience, it’s flawless. Kudos to the team,” a user commented. Some others also expressed that using this tool for systems in production can further complicate things. A user added, “Somebody in our team put this on production. I guess this solution has some merits if you need something quick, but in the long run it turned out to be painful. It's basically SQL over REST. Additionally, your DB schema becomes your API schema and that either means you force one for the purposes of the other or you build DB views to fix that.”

You can read more about PostgREST on its official website. Also, check out its GitHub repository.

After PostgreSQL, DigitalOcean now adds MySQL and Redis to its managed databases’ offering
Amazon Aurora makes PostgreSQL Serverless generally available
PostgreSQL 12 progress update
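For a feel of what "database as API" looks like from a client's point of view, here is a small hedged Python sketch against a hypothetical PostgREST deployment. The host, the `todos` table, its columns, and the JWT are all assumptions for illustration; PostgREST exposes tables/views as endpoints and accepts filters such as `column=eq.value` as query parameters.

```python
# Illustrative client calls against an assumed local PostgREST instance.
import requests

BASE = "http://localhost:3000"   # assumed PostgREST address for this example

# GET /todos?done=eq.false -> rows of the hypothetical "todos" table where done = false
rows = requests.get(f"{BASE}/todos", params={"done": "eq.false"}).json()

# POST inserts a row; authorization would normally carry a JWT issued for a DB role.
resp = requests.post(
    f"{BASE}/todos",
    json={"task": "read the PostgREST docs"},
    headers={"Authorization": "Bearer <jwt>"},   # placeholder token
)
print(resp.status_code, rows)
```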

AmoebaNets: Google’s new evolutionary AutoML

Savia Lobo
16 Mar 2018
2 min read
In order to detect objects within an image, artificial neural networks require careful design by experts over years of difficult research. They then address one specific task, such as finding what's in a photograph, calling a genetic variant, or helping diagnose a disease. Google believes one approach to generating these ANN architectures is the use of evolutionary algorithms. So, today Google introduced AmoebaNets, its evolutionary AutoML approach, which achieves state-of-the-art results on datasets such as ImageNet and CIFAR-10.

Google offers AmoebaNets as an answer to questions such as: by using computational resources to programmatically evolve image classifiers at unprecedented scale, can one achieve solutions with minimal expert participation? How good can today's artificially evolved neural networks be?

These questions were addressed in two papers:

- “Large-Scale Evolution of Image Classifiers”, presented at ICML 2017. In this paper, the authors set up an evolutionary process with simple building blocks and trivial initial conditions. The idea was to "sit back" and let evolution at scale do the work of constructing the architecture.
- “Regularized Evolution for Image Classifier Architecture Search” (2018). This paper includes a scaled-up computation using Google's new TPUv2 chips. This combination of modern hardware, expert knowledge, and evolution worked together to produce state-of-the-art models on CIFAR-10 and ImageNet, two popular benchmarks for image classification.

One important feature of the evolutionary algorithm used in the second paper is a form of regularization: instead of letting the worst neural networks die, the oldest ones are removed, regardless of how good they are. This improves robustness to changes in the task being optimized and tends to produce more accurate networks in the end. Since weight inheritance is not allowed, all networks must train from scratch; this form of regularization therefore selects for networks that remain good when they are re-trained.

These models achieve state-of-the-art results for CIFAR-10 (mean test error = 2.13%), mobile-size ImageNet (top-1 accuracy = 75.1% with 5.1M parameters) and ImageNet (top-1 accuracy = 83.1%).

Read more about AmoebaNets on the Google Research Blog.
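The "remove the oldest, not the worst" idea is easy to sketch. Below is a toy Python illustration of that aging-style evolutionary loop; the architecture encoding, mutation, and fitness function are stand-ins invented for illustration (a real run would train each candidate network from scratch).

```python
# Toy sketch of aging/regularized evolution: the oldest individual dies each cycle.
import random
from collections import deque

def fitness(arch):
    # Placeholder: real work would train the candidate network and return accuracy.
    return -sum((g - 0.5) ** 2 for g in arch)

def mutate(arch):
    child = list(arch)
    child[random.randrange(len(child))] = random.random()
    return child

def regularized_evolution(cycles=200, population_size=20, sample_size=5):
    population = deque()                       # oldest individual sits on the left
    for _ in range(population_size):
        arch = [random.random() for _ in range(8)]
        population.append((arch, fitness(arch)))
    best = max(population, key=lambda p: p[1])
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda p: p[1])        # tournament selection
        child = mutate(parent[0])
        child_entry = (child, fitness(child))           # trained from scratch: no weight inheritance
        population.append(child_entry)
        population.popleft()                            # remove the OLDEST, not the worst
        best = max(best, child_entry, key=lambda p: p[1])
    return best

print(regularized_evolution()[1])
```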

Alibaba’s chipmaker launches open source RISC-V based ‘XuanTie 910 processor’ for 5G, AI, IoT and self-driving applications

Vincy Davis
26 Jul 2019
4 min read
Alibaba’s chip subsidiary Pingtouge, launched in 2018, made a major announcement yesterday: it is launching its first product, the XuanTie 910 chip processor, built on the open source RISC-V instruction set architecture. The XuanTie 910 processor is expected to reduce the costs of related chip production by more than 50%, reports Caixin Global.

XuanTie 910, also known as T-Head, will soon be available in the market for commercial use. Pingtouge will also release some of XuanTie 910’s code on GitHub for free to help the global developer community create innovative applications. No release dates have been revealed yet.

What are the properties of the XuanTie 910 processor?

The XuanTie 910 is a 16-core processor that delivers 7.1 CoreMark/MHz, and its main frequency can reach 2.5 GHz. The processor can be used to manufacture high-end edge-based microcontrollers (MCUs), CPUs, and systems-on-chip (SoCs), for applications like 5G telecommunication, artificial intelligence (AI), and autonomous driving. The XuanTie 910 offers a 40% performance increase over mainstream RISC-V instructions and also a 20% increase in terms of instructions. According to Synced, the XuanTie 910 has two unconventional properties:

- It has a 2-stage pipelined, out-of-order, triple-issue design with two memory accesses per cycle.
- The processor's computing, storage and multi-core capabilities are superior due to an extended instruction set; the XuanTie 910 can extend more than 50 instructions beyond RISC-V.

Last month, The Verge reported that an internal ARM memo had instructed its staff to stop working with Huawei. With the US blacklisting China’s telecom giant Huawei and banning any American company from doing business with it, it seems that ARM is also following the American strategy. Although ARM is based in the U.K. and is owned by Japan's SoftBank Group, it does have “US origin technology”, as claimed in the internal memo.

This may be one of the reasons why Alibaba is increasing its efforts in developing RISC-V, so that Chinese tech companies can become independent of Western technologies. A XuanTie 910 processor can assure Chinese companies of a stable future, with no fear of it being banned by Western governments. Other than being cost-effective, RISC-V also has other advantages, like more flexibility compared to ARM. With complex licence policies and high power requirements, it is going to be a challenge for ARM to compete against RISC-V and MIPS (Microprocessor without Interlocked Pipeline Stages) processors.

A Hacker News user comments, “I feel like we (USA) are forcing China on a path that will make them more competitive long term.” Another user says, “China is going to be key here. It's not just a normal market - China may see this as essential to its ability to develop its technology. It's Made in China 2025 policy. That's taken on new urgency as the west has started cutting China off from western tech - so it may be normal companies wanting some insurance in case intel / arm cut them off (trade disputes etc) AND the govt itself wanting to product its industrial base from cutoff during trade disputes”

Some users also feel that technology wins when two big economies keep bringing out innovative technologies. A comment on Hacker News reads, “Good to see development from any country. Obviously they have enough reason to do it. Just consider sanctions. They also have to protect their own market. Anyone that can afford it, should do it. Ultimately it is a good thing from technology perspective.”

Not all US tech companies are wary of partnering with Chinese counterparts. Two days ago, Salesforce, an American cloud-based software company, announced a strategic partnership with Alibaba. The partnership aims to help Salesforce localize its products in mainland China, Hong Kong, Macau, and Taiwan, enabling Salesforce customers to market, sell, and operate through services like Alibaba Cloud and Tmall.

Winnti Malware: Chinese hacker group attacks major German corporations for years, German public media investigation reveals
The US Justice Department opens a broad antitrust review case against tech giants
Salesforce is buying Tableau in a $15.7 billion all-stock deal

How Deep Neural Networks can improve Speech Recognition and generation

Sugandha Lahoti
02 Feb 2018
7 min read
While watching your favorite movie or TV show, you must have sometimes found it difficult to decipher what the characters are saying, especially if they are talking really fast, or, well, you're watching a show in a language you don't know. You quickly add subtitles and voila, the problem is solved. But do you know how these subtitles work? Instead of a person writing them, a computer automatically recognizes the characters' speech and generates the script. However, this is just a trivial example of what computers and neural networks can do in the field of speech understanding and generation. Today, we're going to talk about how deep neural networks have improved the ability of our computing systems to understand and generate human speech.

How traditional speech recognition systems work

Traditionally, speech recognition models used classification algorithms to arrive at a distribution of possible phonemes for each frame. These classification algorithms were based on highly specialized features such as MFCCs. Hidden Markov Models (HMMs) were used in the decoding phase. This model was accompanied by a pre-trained language model and was used to find the most likely sequence of phones that could be mapped to output words.

With the emergence of deep learning, neural networks were used in many aspects of speech recognition, such as phoneme classification, isolated word recognition, audiovisual speech recognition, audio-visual speaker recognition and speaker adaptation. Deep learning enabled the development of Automatic Speech Recognition (ASR) systems. These ASR systems require separate models, namely an acoustic model (AM), a pronunciation model (PM) and a language model (LM). The AM is typically trained to recognize context-dependent states or phonemes, by bootstrapping from an existing model which is used for alignment. The PM maps the sequences of phonemes produced by the AM into word sequences. Word sequences are scored using an LM trained on large amounts of text data, which estimates the probabilities of word sequences. However, training independent components adds complexity and is suboptimal compared to training all components jointly. This called for developing end-to-end systems in the ASR community: systems which attempt to learn the separate components of an ASR pipeline jointly, as a single system.

A single-system speech recognition model

End-to-end trained neural networks can essentially recognize speech without using an external pronunciation lexicon or a separate language model. End-to-end trained systems can directly map the input acoustic speech signal to word sequences. In such sequence-to-sequence models, the AM, PM, and LM are trained jointly in a single system. Since these models directly predict words, the process of decoding utterances is also greatly simplified. End-to-end ASR systems do not require bootstrapping from decision trees or time alignments generated from a separate system, thereby making the training of such models simpler than that of conventional ASR systems.

There are several sequence-to-sequence models, including connectionist temporal classification (CTC), the recurrent neural network (RNN) transducer, attention-based models, and so on. CTC models are used to train end-to-end systems that directly predict grapheme sequences. This model was proposed by Graves et al. as a way of training end-to-end models without requiring a frame-level alignment of the target labels for a training utterance. This basic CTC model was extended by Graves to include a separate recurrent LM component, in a model referred to as the recurrent neural network (RNN) transducer. The RNN transducer augments the encoder network from the CTC model architecture with a separate recurrent prediction network over the output symbols.

Attention-based models are another type of end-to-end sequence model. These models consist of an encoder network, which maps the input acoustics into a higher-level representation, and an attention-based decoder that predicts the next output symbol based on the previous predictions.

[Figure: A schematic representation of various sequence-to-sequence modeling approaches]

Google's Listen-Attend-Spell (LAS) end-to-end architecture is one such attention-based model. Their end-to-end system achieves a word error rate (WER) of 5.6%, which corresponds to a 16% relative improvement over a strong conventional system that achieves a 6.7% WER. Additionally, the end-to-end model used to output the initial word hypothesis, before any hypothesis rescoring, is 18 times smaller than the conventional model.

These sequence-to-sequence models are comparable with traditional approaches on dictation test sets. However, the traditional models still outperform end-to-end systems on voice-search test sets. Future work is being done on building optimal models for voice-search tests as well. More work is also expected on multi-dialect and multi-lingual systems, so that data for all dialects/languages can be combined to train one network, without the need for a separate AM, PM, and LM for each dialect/language.

Enough with understanding speech. Let's talk about generating it

Text-to-speech (TTS) conversion, i.e. generating natural-sounding speech from text, or allowing people to converse with machines, has been one of the top research goals of recent times. Deep neural networks have greatly improved the overall development of TTS systems, as well as enhanced individual pieces of such systems.

In 2012, Google first used Deep Neural Networks (DNNs) instead of Gaussian Mixture Models (GMMs), which were then the core technology behind TTS systems. DNNs assessed sounds at every instant in time, with increased speech recognition accuracy. Later, better neural network acoustic models were built using CTC and sequence-discriminative training techniques based on RNNs. Although blazingly fast and accurate, these TTS systems were largely based on concatenative TTS, where a very large database of short speech fragments is recorded from a single speaker and then recombined to form complete utterances. This led to the development of parametric TTS, where all the information required to generate the data is stored in the parameters of the model, and the contents and characteristics of the speech are controlled via the inputs to the model.

WaveNet further enhanced these parametric models by directly modeling the raw waveform of the audio signal, one sample at a time. WaveNet yielded more natural-sounding speech using raw waveforms and was able to model any kind of audio, including music. Baidu then came out with its Deep Voice TTS system, constructed entirely from deep neural networks. Their system was able to do audio synthesis in real time, giving up to a 400x speedup over previous WaveNet inference implementations. Google then released Tacotron, an end-to-end generative TTS model that synthesizes speech directly from characters.

Tacotron was able to achieve a 3.82 mean opinion score (MOS), outperforming the traditional parametric system in terms of speech naturalness. Tacotron was also considerably faster than sample-level autoregressive methods because of its ability to generate speech at the frame level. Most recently, Google has released Tacotron 2, which took inspiration from past work on Tacotron and WaveNet. It features a Tacotron-style, recurrent sequence-to-sequence feature prediction network that generates mel spectrograms, followed by a modified version of WaveNet which generates time-domain waveform samples conditioned on the generated mel spectrogram frames. The model achieved a MOS of 4.53, compared to a MOS of 4.58 for professionally recorded speech.

Deep neural networks have been a strong force behind the development of end-to-end speech recognition and generation models. Although these end-to-end models have compared substantially well against the classical approaches, more work is still to be done. As of now, end-to-end speech models cannot process speech in real time, and real-time speech processing is a strong requirement for latency-sensitive applications such as voice search. Hence, more progress is expected in this area. Also, end-to-end models do not give the expected results when evaluated on live production data. There is also difficulty in learning proper spellings for rarely used words such as proper nouns, which is done quite easily when a separate PM is used. More efforts will need to be made to address these challenges as well.
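To make the CTC idea above tangible, here is a minimal hedged sketch using PyTorch's built-in CTCLoss. The tiny LSTM "encoder", the feature dimensions, and the target lengths are assumptions chosen for illustration, not a production ASR recipe; the point is that training needs no frame-level alignment between audio frames and labels.

```python
# Minimal CTC training step sketch (illustrative shapes, not a real ASR model).
import torch
import torch.nn as nn

T, N, C = 50, 4, 28                 # time steps, batch size, output labels (index 0 = blank)
encoder = nn.LSTM(input_size=40, hidden_size=C)   # stand-in acoustic encoder

features = torch.randn(T, N, 40)                  # e.g. 40-dim filterbank frames
log_probs = encoder(features)[0].log_softmax(-1)  # (T, N, C) per-frame label scores

targets = torch.randint(1, C, (N, 12))            # grapheme targets, no alignment given
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()                                    # trainable end to end
print(loss.item())
```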

Hadoop 3.2.0 released with support for node attributes in YARN, Hadoop submarine and more

Amrata Joshi
24 Jan 2019
3 min read
The team at Apache Hadoop has released Apache Hadoop 3.2.0, an open source software platform for distributed storage and processing of large data sets. This version is the first in the 3.2 release line and is not generally available or production-ready yet.

What's new in Hadoop 3.2.0?

Node attributes support in YARN
This release features node attributes, which help in tagging multiple labels on nodes based on their attributes and in placing containers based on expressions over these labels. Node attributes are not associated with any queue, so no queue resource planning or authorization is needed for attributes.

Hadoop Submarine on YARN
This release comes with Hadoop Submarine, which enables data engineers to develop, train and deploy deep learning models in TensorFlow on the same Hadoop YARN cluster where the data resides. It also allows jobs to access data/models in HDFS (Hadoop Distributed File System) and other storage, and supports user-specified Docker images and customized DNS names for roles such as tensorboard.$user.$domain:6006.

Storage Policy Satisfier
The Storage Policy Satisfier lets HDFS applications move blocks between storage types as they set storage policies on files/directories. It is also a solution for decoupling storage capacity from compute capacity.

Enhanced S3A connector
This release comes with support for an enhanced S3A connector, including better resilience to throttled AWS S3 and DynamoDB IO.

ABFS filesystem connector
It supports the latest Azure Data Lake Gen2 storage.

Major improvements
- The jdk1.7 profile has been removed from the hadoop-annotations module.
- Redundant logging related to tags has been removed from configuration.
- The ADLS connector has been updated to use the current SDK version (2.2.7).
- This release includes LocalizedResource size information in the NM download log for localization.
- This version comes with the ability to configure auxiliary services from HDFS-based JAR files.
- This release comes with the ability to specify user environment variables individually.
- The debug messages in MetricsConfig.java have been improved.
- Capacity Scheduler performance metrics have been added.
- This release comes with added support for node labels in opportunistic scheduling.

Major bug fixes
- The issue with logging for split-dns multihome has been resolved.
- The snapshotted encryption zone information in this release is immutable.
- A shutdown routine has been added in HadoopExecutor to ensure clean shutdown.
- Registry entries are deleted from ZK on ServiceClient.
- The javadoc of package-info.java has been improved.
- An NPE in AbstractSchedulerPlanFollower has been fixed.

To know more about this release, check out the release notes on Hadoop's official website.

Why did Uber create Hudi, an open source incremental processing framework on Apache Hadoop?
Uber's Marmaray, an Open Source Data Ingestion and Dispersal Framework for Apache Hadoop
Setting up Apache Druid in Hadoop for Data visualizations [Tutorial]

Facebook AI introduces Aroma, a new code recommendation tool for developers

Natasha Mathur
09 Apr 2019
3 min read
The Facebook AI team announced a new tool, called Aroma, last week. Aroma is a code-to-code search and recommendation tool that uses machine learning (ML) to simplify the process of gaining insights from big codebases.

Aroma allows engineers to find common coding patterns easily by making a search query, without any need to manually browse through code snippets. This, in turn, saves time in their development workflow. So, in case a developer has written code but wants to see how others have implemented the same thing, they can run a search query to find similar code in related projects. After the search query is run, results are returned as code ‘recommendations’. Each code recommendation is built from a cluster of similar code snippets found in the repository.

Aroma is more advanced than traditional code search tools. For instance, Aroma performs the search on syntax trees. Instead of looking for string-level or token-level matches, Aroma can find instances that are syntactically similar to the query code, and then further highlight the matching code by cutting down the unrelated syntax structures. Aroma is fast and creates recommendations within seconds for large codebases. Moreover, Aroma's core algorithm is language-agnostic and can be deployed across codebases in Hack, JavaScript, Python, and Java.

How does Aroma work?

Aroma follows a three-step process to make code recommendations: feature-based search, re-ranking and clustering, and intersecting.

For feature-based search, Aroma indexes the code corpus as a sparse matrix. It parses each method in the corpus and creates its parse tree, then extracts a set of structural features from the parse tree of each method. These features capture information about variable usage, method calls, and control structures. Finally, a sparse vector is created for each method according to its features, and the top 1,000 method bodies whose dot products with the query vector are highest are retrieved as the candidate set for recommendation.

For re-ranking and clustering, Aroma first re-ranks the candidate methods by their similarity to the query code snippet. Since the sparse vectors contain only abstract information about which features are present, the dot product score is an underestimate of the actual similarity of a code snippet to the query. To eliminate that, Aroma applies ‘pruning’ to the method syntax trees. This discards the irrelevant parts of a method body and retains all the parts that best match the query snippet. This is how it re-ranks the candidate code snippets by their actual similarities to the query. Further ahead, Aroma runs an iterative clustering algorithm to find clusters of code snippets that are similar to each other and contain extra statements useful for making code recommendations.

For intersecting, a code snippet is first taken as the “base” code, and then ‘pruning’ is applied to it iteratively with respect to every other method in the cluster. The code remaining after the pruning process is the code common among all methods, and this becomes a code recommendation.

“We believe that programming should become a semiautomated task in which humans express higher-level ideas and detailed implementation is done by the computers themselves”, states the Facebook AI team. For more information, check out the official Facebook AI blog.
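The "featurize each method, then rank candidates by dot product" retrieval step can be illustrated with a few lines of Python. This is only a toy stand-in: real Aroma extracts structural features from parse trees, whereas the sketch below uses simple token counts, and the tiny corpus is invented for illustration.

```python
# Toy illustration of feature-vector retrieval: rank methods by dot product with the query.
from collections import Counter

def featurize(code):
    # Stand-in for Aroma's parse-tree features: a bag of tokens.
    return Counter(code.replace("(", " ( ").replace(")", " ) ").split())

def dot(a, b):
    return sum(a[t] * b[t] for t in a if t in b)

corpus = {
    "m1": "with open(path) as f: data = f.read()",
    "m2": "requests.get(url).json()",
    "m3": "with open(path) as f: lines = f.readlines()",
}
query = "with open(fname) as f: body = f.read()"

q = featurize(query)
ranked = sorted(corpus, key=lambda m: dot(q, featurize(corpus[m])), reverse=True)
print(ranked)   # the file-reading snippets rank above the unrelated HTTP call
```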
How to make machine learning based recommendations using Julia [Tutorial]
Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs
Facebook AI research and NYU school of medicine announces new open-source AI models and MRI dataset

Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018

Melisha Dsouza
21 Aug 2018
9 min read
Edge services and edge computing have been talked about since at least the 90s. When edge computing is extended to the cloud, it can be managed and consumed as if it were local infrastructure. The logic is simple: it's the same as how humans find it hard to interact with infrastructure that is too far away.

Edge analytics is an exciting area of data analytics that is gaining a lot of attention these days. While traditional analytics answers questions like what happened, why it happened, what is likely to happen and what you should do about it, edge analytics is data analytics in real time. It deals with the operations performed on data at the edge of a network, either at or close to a sensor, a network switch or some other connected device. This saves time and avoids overhead and latency problems. As they rightly say, time is money! Now imagine using AI to facilitate edge analytics.

What does AI in edge computing mean?

Smart applications rely on sending tons of information to the cloud. Data can be compromised in such situations, so security and privacy challenges may arise. Application developers have to consider whether the bulk of information sent to the cloud contains personally identifiable information (PII) and whether storing it is in breach of privacy laws. They also have to take the necessary measures to secure the information they store and prevent it from being stolen, accessed or shared illegally. Now that is a lot of work!

Enter "intelligent edge" computing to save the day! Edge computing by itself will not replace the power of the cloud. It can, however, reduce cloud payloads drastically when used in collaboration with machine learning, transforming the AI's operating model into something like that of the human brain: perform routine and time-critical decisions at the edge and only refer to the cloud where more intensive computation and historical analysis is needed.

Why use AI in edge computing?

Most mobile apps, IoT devices and other applications that work with AI and machine learning algorithms rely on the processing power of the cloud or of a datacenter situated thousands of miles away. They have little or no intelligence to apply processing at the edge. Even if you show your favorite pet picture to your smart device a thousand times, it'll still have to look it up on its cloud server in order to recognize whether it's a dog or a cat for the 1001st time.

OK, who cares if it takes a couple of minutes more for my device to differentiate between a dog and a cat! But consider a robot surgeon that wants to perform a sensitive operation on a patient. It needs to analyze images and make decisions dozens of times per second. The round trip to the cloud would cause lags that could have severe consequences. God forbid there is a cloud outage or poor internet connectivity. To perform such tasks efficiently and faster, and to reduce the back-and-forth communication between the cloud and the device, implementing AI at the edge is a good idea.

Top 7 AI for edge computing use cases that caught our attention

Now that you are convinced that the intelligent edge, or AI-powered edge computing, does have potential, here are some recent advancements in edge AI and some ways it is being used in the real world.

#1 Consumer IoT: Microsoft's $5 billion investment in IoT to empower the intelligent cloud and the intelligent edge
One of the central design principles of Microsoft's intelligent edge products and services is to secure data no matter where it is stored. Azure Sphere is one of their intelligent edge solutions to power and protect connected microcontroller unit (MCU)-powered devices. There are 9 billion MCU-powered devices shipping every year, which power everything from household stoves and refrigerators to industrial equipment. That's the intelligent edge on the consumer end of the application spectrum. Let's look at an industrial use case next.

#2 Industrial IoT: GE adds edge analytics, AI capabilities to its industrial IoT suite
To make a mark in the field of the industrial internet of things (IIoT), GE Digital is adding features to its Predix platform as a service (PaaS). This will let industrial enterprises run predictive analytics as close as possible to data sources, whether they be pumps, valves, heat exchangers, turbines or even machines on the move. The main idea behind edge computing is to analyze data in near real time, optimize network traffic and cut costs. GE Digital has been working to integrate the company's field service management (FSM) software with GE products and third-party tools. For example, artificial intelligence-enabled predictive analytics now integrate the Apache Spark AI engine to improve service time estimates. New application integration features let service providers launch and share FSM data with third-party mobile applications installed on the same device. Read the whole story on Network World.

#3 Embedded computing and robotics: Defining (artificial) intelligence at the edge for IoT systems
Machine intelligence has largely been the domain of computer vision (CV) applications such as object recognition. While artificial intelligence technology is thus far still in its infancy, its benefits for advanced driver assistance systems (ADAS), collaborative robots (cobots), sense-and-avoid drones, and a host of other embedded applications are obvious. Related to the origins of AI technology is the fact that most, if not all, machine learning frameworks were developed to run on data center infrastructure. As a result, the software and tools required to create CNNs/DNNs for embedded targets have been lacking. In the embedded machine learning sense, this has meant that intricate knowledge of both embedded processing platforms and neural network creation has been a prerequisite for bringing AI to the embedded edge, a luxury most organizations do not have, or an extremely time-consuming effort if they do. Thanks to embedded silicon vendors, this paradigm is set to shift. Based on power consumption benchmarks, AI technology is quickly approaching deeply embedded levels. Read the whole article on Embedded Computing Design to know more about how the intelligent edge is changing our outlook on embedded systems.

#4 Smart grids: Grid edge control and analytics
Grid edge controllers are intelligent servers deployed as an interface between the edge nodes and the utility's core network. The smart grid, as we know, is essentially the concept of establishing two-way communication between distribution infrastructure, the consumer and the utility head end using Internet Protocol. From residential rooftops to solar farms, commercial solar, electric vehicles and wind farms, smart meters are generating a ton of data. This helps utilities to view the amount of energy available and required, allowing their demand response to become more efficient, avoiding peaks and reducing costs. This data is first processed in the grid edge controllers, which perform local computation and analysis and send only the necessary actionable information over a wireless network to the utility.

#5 Predictive maintenance: Oil and gas remote monitoring
Using Internet of Things devices such as temperature, humidity, pressure, and moisture sensors, alongside internet protocol (IP) cameras and other technologies, oil and gas monitoring operations produce an immense amount of data which provides key insights into the health of their specific systems. Edge computing allows this data to be analyzed, processed, and then delivered to end users in real time. This, in turn, enables control centers to access data as it occurs in order to foresee and prevent malfunctions or incidents before they occur.

#6 Cloudless autonomous vehicles
Self-driving cars and intelligent traffic management systems are already the talk of the town, and the integration of edge AI could be the next big step. When it comes to autonomous systems, safety is paramount. Any delay, malfunction, or anomaly within autonomous vehicles can prove fatal. Calculating a number of parameters at the same time, edge computing and AI enable safe and fast transportation with quick decision-making capabilities.

#7 Intelligent traffic management
Edge computing is able to analyze and process data on the traffic hardware itself and find ways to remove unnecessary traffic. This reduces the overall amount of data that needs to be transmitted across a given network and helps to reduce both operating and storage costs.

What's next for AI-enabled edge computing?

The intelligent edge will allow humans to simplify multi-faceted processes by replacing the manual process of sorting and identifying complex data, key insights and actionable plans. This can help organizations gain a competitive edge through better decision-making, improved ROI, operational efficiency and cost savings. However, on the flip side, there are also cons to machine learning based edge computing. The cost of deploying and managing an edge will be considerable, and, as with all rapidly evolving technologies, evaluating, deploying and operating edge computing solutions has its risks, a key risk area being security. Tons of data need to be made available for processing at the edge, and where there is data, there is always a fear of a data breach. Performing so many operations on the data can also be challenging.

All in all, even though the concept of incorporating AI into edge computing is exciting, some work does need to be done to get intelligent edge-based solutions fully set up, functional and running smoothly in production. What's your take on this digital transformation?

Reinforcement learning model optimizes brain cancer treatment
Tesla is building its own AI hardware for self-driving cars
OpenAI builds reinforcement learning based system giving robots human-like dexterity

Dopamine: A Tensorflow-based framework for flexible and reproducible Reinforcement Learning research by Google

Savia Lobo
28 Aug 2018
3 min read
Yesterday, Google introduced a new TensorFlow-based framework named Dopamine, which aims to provide flexibility, stability, and reproducibility for both new and experienced RL researchers. The release also includes a set of colabs that clarify how to use the Dopamine framework.

Dopamine is inspired by one of the main components of reward-motivated behavior in the brain, and it also reflects the strong historical connection between neuroscience and reinforcement learning research. Its main aim is to enable speculative research that drives radical discoveries.

Dopamine framework feature highlights

Ease of use
Clarity and simplicity are the two key considerations in Dopamine's design. Its code is compact (about 15 Python files) and well documented. This is achieved by focusing on the Arcade Learning Environment (a mature, well-understood benchmark) and four value-based agents:

- DQN,
- C51,
- a carefully curated simplified variant of the Rainbow agent, and
- the Implicit Quantile Network agent, which was presented last month at the International Conference on Machine Learning (ICML).

Reproducibility
Google has provided the Dopamine code with full test coverage; these tests also serve as an additional form of documentation. Dopamine follows the recommendations given by Machado et al. (2018) on standardizing empirical evaluation with the Arcade Learning Environment.

Benchmarking
It is important for new researchers to be able to quickly benchmark their ideas against established methods. To this end, Google has provided the full training data of the four provided agents across the 60 games supported by the Arcade Learning Environment, along with a website where one can quickly visualize the training runs for all provided agents on all 60 games. The published snapshot shows the training runs for the four agents on Seaquest, one of the Atari 2600 games supported by the Arcade Learning Environment: the x-axis represents iterations, where each iteration is 1 million game frames (4.5 hours of real-time play); the y-axis is the average score obtained per play; and the shaded areas show confidence intervals from 5 independent runs.

The Google community aims to empower researchers to try out new ideas, both incremental and radical, with Dopamine's flexibility and ease of use. It is actively being used in Google's research, giving them the flexibility to iterate quickly over many ideas. To know more about Dopamine, visit the Google AI blog. You can also check out its GitHub repo.

Build your first Reinforcement learning agent in Keras [Tutorial]
Reinforcement learning model optimizes brain cancer treatment, reduces dosing cycles and improves patient quality of life
OpenAI builds a reinforcement learning based system giving robots human-like dexterity

Neural Network Intelligence: Microsoft’s open source automated machine learning toolkit

Amey Varangaonkar
01 Oct 2018
2 min read
Google’s Cloud AutoML now has competition; Microsoft has released an open source automated machine learning toolkit of its own. Dubbed Neural Network Intelligence, this toolkit will allow data scientists and machine learning developers to perform tasks such as neural architecture search and hyperparameter tuning with relative ease.

Per Microsoft's official page, the toolkit will provide data scientists, machine learning developers and AI researchers with the necessary tools to customize their AutoML models across various training environments. The toolkit was announced in November 2017 and had been in the research phase for a considerable period of time before it was released for public use recently.

Who can use the Neural Network Intelligence toolkit?

Microsoft's highly anticipated toolkit for automated machine learning is perfect for you if:

- You want to try out different AutoML algorithms for training your machine learning model
- You want to run AutoML jobs in different training environments, including remote servers and the cloud
- You want to implement your own AutoML algorithms and compare their performance with other algorithms
- You want to incorporate your AutoML models in your own custom platform

With the Neural Network Intelligence toolkit, data scientists and machine learning developers can train and customize their machine learning models more effectively. The tool is expected to go head to head with Auto-Keras, another open source AutoML library for deep learning. Auto-Keras has quickly gained quite a lot of traction, with more than 3,000 stars on GitHub, suggesting the growing popularity of automated machine learning. You can download and learn more about this AutoML toolkit on its official GitHub page.

Read more
What is Automated Machine Learning (AutoML)?
Top AutoML libraries for building your ML pipelines
Anatomy of an automated machine learning algorithm (AutoML)
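To give a sense of what a hyperparameter-tuning trial looks like, here is a hedged Python sketch built around NNI's trial API (`nni.get_next_parameter` and `nni.report_final_result`). The search space, the scikit-learn model, and the metric are assumptions for illustration, and the script is meant to be launched under an NNI experiment whose search space defines a parameter named "C".

```python
# Sketch of an NNI trial script: the tuner proposes hyperparameters, the trial reports a score.
import nni
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def main():
    # One hyperparameter configuration per trial, supplied by the NNI tuner.
    params = nni.get_next_parameter()            # e.g. {"C": 0.5} (assumed search space)
    C = params.get("C", 1.0)

    X, y = load_iris(return_X_y=True)            # toy dataset standing in for real training data
    model = LogisticRegression(C=C, max_iter=200)
    score = cross_val_score(model, X, y, cv=3).mean()

    nni.report_final_result(score)               # reported back so the tuner can pick the next config

if __name__ == "__main__":
    main()
```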

A universal bypass tricks Cylance AI antivirus into accepting all top 10 Malware revealing a new attack surface for machine learning based security

Sugandha Lahoti
19 Jul 2019
4 min read
Researchers from Skylight Cyber, an Australian cybersecurity enterprise, have tricked BlackBerry Cylance's AI-based antivirus product. They identified a peculiar bias of the antivirus product towards a specific game engine and exploited it to trick the product into accepting malicious files. This discovery means companies working in the field of artificial intelligence-driven cybersecurity need to rethink their approach to creating new products.

The bypass is not limited to Cylance; the researchers chose it because it is a leading vendor in the field and its product is publicly available. The researchers, Adi Ashkenazy and Shahar Zini from Skylight Cyber, say they can reverse the model of any AI-based EPP (endpoint protection platform) product and find a bias enabling a universal bypass. Essentially, if you could truly understand how a certain model works and the type of features it uses to reach a decision, you would have the potential to fool it consistently.

How did the researchers trick Cylance into thinking bad is good?

Cylance's machine learning algorithm has been trained to favor a benign file, causing it to ignore malicious code if it sees strings from the benign file attached to a malicious file. The researchers took advantage of this and appended strings from a non-malicious file to a malicious one, tricking the system into thinking the malicious file is safe and avoiding detection. The trick works even if the Cylance engine previously concluded the same file was malicious before the benign strings were appended to it.

The Cylance engine keeps a scoring mechanism ranging from -1000 for the most malicious files to +1000 for the most benign. It also whitelists certain families of executable files to avoid triggering false positives on legitimate software. The researchers suspected that the machine learning model would be biased toward code in those whitelisted files, so they extracted strings from an online gaming program that Cylance had whitelisted and appended them to malicious files. The Cylance engine tagged the files benign, shifting scores from high negative numbers to high positive ones.

https://youtu.be/NE4kgGjhf1Y

The researchers tested the bypass against the WannaCry ransomware, SamSam ransomware, the popular Mimikatz hacking tool, and hundreds of other known malicious files. The method proved successful for 100% of the top 10 malware for May 2019, and close to 90% of a larger sample of 384 malware files.

“As far as I know, this is a world-first, proven global attack on the ML [machine learning] mechanism of a security company,” Adi Ashkenazy, CEO of Skylight Cyber, told Motherboard, which first reported the news. “After around four years of super hype [about AI], I think this is a humbling example of how the approach provides a new attack surface that was not possible with legacy [antivirus software].”

Gregory Webb, chief executive officer of malware protection firm Bromium Inc., told SiliconAngle that the news raises doubts about the concept of categorizing code as “good” or “bad.” “This exposes the limitations of leaving machines to make decisions on what can and cannot be trusted,” Webb said. “Ultimately, AI is not a silver bullet.”

Martijn Grooten, a security researcher, also added his views to the Cylance bypass story. He states, “This is why we have good reasons to be concerned about the use of AI/ML in anything involving humans because it can easily reinforce and amplify existing biases.”

The Cylance team has now confirmed the global bypass issue and will release a hotfix in the next few days. “We are aware that a bypass has been publicly disclosed by security researchers. We have verified there is an issue which can be leveraged to bypass the anti-malware component of the product. Our research and development teams have identified a solution and will release a hotfix automatically to all customers running current versions in the next few days,” the team wrote in a blog post.

You can go through the blog post by the Skylight Cyber researchers for additional information.

Microsoft releases security updates: a “wormable” threat similar to WannaCry ransomware discovered
25 million Android devices infected with ‘Agent Smith’, a new mobile malware
FireEye reports infrastructure-crippling Triton malware linked to Russian government tech institute
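The failure mode the researchers describe is easy to demonstrate on a deliberately naive scorer. The toy Python sketch below is purely illustrative and is NOT Cylance's engine: the marker strings and weights are invented to show how appending "benign-looking" strings can flip a score that leans too heavily on whitelisted features.

```python
# Toy string-feature scorer showing why appended benign strings can mask malware.
def naive_score(file_bytes):
    benign_markers = [b"GameEngineInit", b"RenderFrame", b"LoadTexture"]      # invented
    malicious_markers = [b"CreateRemoteThread", b"VirtualAllocEx"]            # invented
    score = 200 * sum(file_bytes.count(m) for m in benign_markers)
    score -= 300 * sum(file_bytes.count(m) for m in malicious_markers)
    return max(-1000, min(1000, score))        # clamp to the -1000..+1000 scale described above

malware = b"...CreateRemoteThread...VirtualAllocEx..."
print(naive_score(malware))                    # -600: flagged as malicious

# The bypass: append strings lifted from a whitelisted benign program.
padded = malware + b" GameEngineInit RenderFrame LoadTexture" * 10
print(naive_score(padded))                     # clamped to +1000: now treated as benign
```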
article-image-timescaledb-goes-distributed-implements-chunking-over-sharding-for-scaling-out

TimescaleDB goes distributed; implements ‘Chunking’ over ‘Sharding’ for scaling-out

Sugandha Lahoti
22 Aug 2019
5 min read
TimescaleDB announced yesterday that it is going distributed; this version is currently in private beta, with the public version slated for later this year. TimescaleDB remains built on PostgreSQL, but a major problem with PostgreSQL is scaling out. To address this, TimescaleDB does not implement traditional sharding; instead, it uses 'chunking'.

What is TimescaleDB's chunking?

In TimescaleDB, chunking is the mechanism that scales PostgreSQL for time-series workloads. Chunks are created by automatically partitioning data by multiple dimensions (one of which is time). In a blog post, TimescaleDB specifies, "this is done in a fine-grain way such that one dataset may be comprised of 1000s of chunks, even on a single node."

Chunking offers a wider set of capabilities than sharding, which only offers the option to scale out. These include scaling up (on the same node) as well as scaling out (across multiple nodes), along with elasticity, partitioning flexibility, data retention policies, data tiering, and data reordering. TimescaleDB automatically partitions a table across multiple chunks on the same instance, whether on the same or different disks. Its multi-dimensional chunking auto-creates chunks, keeps recent data chunks in memory, and provides time-oriented data lifecycle management (e.g., for data retention, reordering, or tiering policies).

However, one issue is the management of the large number of chunks (i.e., "sub-problems"). For this, TimescaleDB has come up with the hypertable abstraction to make partitioned tables easy to use and manage.

Hypertable abstraction makes chunking manageable

Hypertables are typically used to handle large amounts of data by breaking it up into chunks, allowing operations to execute efficiently. When the number of chunks is large, the chunks can also be distributed over several machines by using distributed hypertables. Distributed hypertables are similar to normal hypertables, but they add an additional layer of partitioning by distributing chunks across data nodes. They are designed for multi-dimensional chunking with a large number of chunks (from 100s to 10,000s), offering more flexibility in how chunks are distributed across a cluster. Users interact with a distributed hypertable in the same way as with a regular hypertable (which itself looks just like a regular Postgres table).

Chunking does not put an additional burden on applications and developers because applications do not interact directly with chunks (and thus do not need to be aware of the partition mapping themselves, unlike in some sharded systems). The system also does not expose different capabilities for chunks than for the entire hypertable.

TimescaleDB goes distributed

The distributed version is already available for testing in private beta for selected users and customers, with the initial licensed version expected to be widely available later this year. This version will support features such as high write rates, query parallelism, predicate push-down for lower latency, elastically growing a cluster to scale storage and compute, and fault tolerance via physical replicas.
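As a rough illustration of how the hypertable abstraction looks from an application's point of view, here is a minimal Python sketch using the psycopg2 driver. The table, column names, and connection string are invented for the example; create_hypertable and add_dimension are TimescaleDB's documented partitioning functions, while the commented multi-node call follows the API described for the then-private beta and may differ in released versions.

```python
# A minimal sketch, assuming psycopg2 and a reachable TimescaleDB instance.
import psycopg2

conn = psycopg2.connect("dbname=tsdb user=postgres host=localhost")  # example DSN
conn.autocommit = True
cur = conn.cursor()

# An ordinary PostgreSQL table holding time-series readings.
cur.execute("""
    CREATE TABLE conditions (
        time        TIMESTAMPTZ      NOT NULL,
        device_id   TEXT             NOT NULL,
        temperature DOUBLE PRECISION
    )
""")

# Turn it into a (single-node) hypertable: TimescaleDB now auto-creates
# time-partitioned chunks behind the scenes, while inserts and queries
# keep targeting 'conditions' as if it were one table.
cur.execute("SELECT create_hypertable('conditions', 'time')")

# Optionally add a space dimension so chunks are split by device as well
# as by time -- the multi-dimensional chunking described above.
cur.execute("SELECT add_dimension('conditions', 'device_id', number_partitions => 4)")

# Applications never name chunks; they just use the hypertable.
cur.execute("INSERT INTO conditions VALUES (now(), %s, %s)", ("device-42", 21.5))
cur.execute("SELECT avg(temperature) FROM conditions WHERE time > now() - interval '1 day'")
print(cur.fetchone())

# In the distributed beta, a table can instead be created as a distributed
# hypertable so its chunks are spread across data nodes (beta API, may change):
# cur.execute("SELECT create_distributed_hypertable('conditions_dist', 'time', 'device_id')")
```

The point of the sketch is the final comment: whether the hypertable is single-node or distributed, the application's SQL stays the same, which is the core of the "chunking, not sharding" argument.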
Developers were quite intrigued by the new chunking approach, and a number of questions were asked on Hacker News, duly answered by the TimescaleDB team. One question concerned the hot partition problem. A user asked, "The biggest limit is that their 'chunking' of data by time-slices may lead directly to the hot partition problem -- in their case, a 'hot chunk.' Most time series is 'dull time' -- uninteresting time samples of normal stuff. Then, out of nowhere, some 'interesting' stuff happens. It'll all be in that one chunk, which will get hammered during reads."

To this, Erik Nordström, Timescale engineer, replied, "TimescaleDB supports multi-dimensional partitioning, so a specific 'hot' time interval is actually typically split across many chunks, and thus server instances. We are also working on native chunk replication, which allows serving copies of the same chunk out of different server instances. Apart from these things to mitigate the hot partition problem, it's usually a good thing to be able to serve the same data to many requests using a warm cache compared to having many random reads that thrashes the cache."

Another question asked, "In this vision, would this cluster of servers be reserved exclusively for time series data or do you imagine it containing other ordinary tables as well?" To this, Mike Freedman, CTO of Timescale, answered, "We commonly see hypertables (time-series tables) deployed alongside relational tables, often because there exists a relation between them: the relational metadata provides information about the user, sensor, server, security instrument that is referenced by id/name in the hypertable. So joins between these time-series and relational tables are often common, and together these serve the applications one often builds on top of your data. Now, TimescaleDB can be installed on a PG server that is also handling tables that have nothing to do with its workload, in which case one does get performance interference between the two workloads. We generally wouldn't recommend this for more production deployments, but the decision here is always a tradeoff between resource isolation and cost."

Some thought sharding remains the better choice even if chunking improves performance.

https://twitter.com/methu/status/1164381453800525824

Read the official announcement for more information. You can also view the documentation.

TimescaleDB 1.0 officially released

Introducing TimescaleDB 1.0 RC, the first OS time-series database with full SQL support

Zabbix 4.2 release packed with modern monitoring system for data collection, processing and visualization

article-image-elon-musk-reveals-big-plans-with-neuralink

Elon Musk reveals big plans with Neuralink

Guest Contributor
18 Sep 2018
3 min read
Be it a tweet about taking the company private or smoking weed on a radio show, Elon Musk has recently been in the news for all the wrong reasons. Now he is in the news again, but this time for what he is best admired for: being a modern-day visionary. As per reports, the Tesla and SpaceX founder is working on a 'superhuman' product that will connect your brain to a computer.

We all know Musk, along with eight others, founded a company called Neuralink two years ago. The company has been developing implantable brain-computer interfaces, better known as BCIs. While in the short term the company's aim is to use the technology to treat brain diseases, Musk's eventual goal is human enhancement, which he believes will make us more intelligent and powerful than even AI. According to hints he gave a week ago, Neuralink may soon be close to announcing a product unlike anything we have seen: a brain-computer interface.

Appearing on the Joe Rogan Experience podcast last week, Musk stated that he'll soon be announcing a new Neuralink product which will connect your brain to a computer, thus making you superhuman. When asked about Neuralink, Musk said:

"I think we'll have something interesting to announce in a few months that's better than anyone thinks is possible. Best case scenario, we effectively merge with AI. It will enable anyone who wants to have superhuman cognition. Anyone who wants. How much smarter are you with a phone or computer or without? You're vastly smarter, actually. You can answer any question pretty much instantly. You can remember flawlessly. Your phone can remember videos [and] pictures perfectly. Your phone is already an extension of you. You're already a cyborg. Most people don't realise you're already a cyborg. It's just that the data rate, it's slow, very slow. It's like a tiny straw of information flow between your biological self and your digital self. We need to make that tiny straw like a giant river, a huge, high-bandwidth interface."

If we visualize what Musk said, it feels like a scene straight from a Hollywood movie. However, many creations from a decade ago that were thought to belong solely in the world of science fiction have become a reality now. Musk argues that through our over-dependence on smartphones, we have already taken the first step towards our cyborg future. Neuralink is an attempt to accelerate that process by leaps and bounds.

That's not all. Elon Musk was also quoted by CNBC as saying, "If your biological self dies, you can upload into a new unit. Literally, with our Neuralink technology."

Read the full news on CNBC.

About Author

Sandesh Deshpande is currently working as a System Administrator for Packt Publishing. He is highly interested in Artificial Intelligence and Machine Learning.

Tesla is building its own AI hardware for self-driving cars

Elon Musk's tiny submarine is a lesson in how not to solve problems in tech.

DeepMind, Elon Musk, and others pledge not to build lethal AI