Tech News - Data


3rd Nov.' 17 - Headlines

Packt Editorial Staff
03 Nov 2017
4 min read
Apache Kafka 1.0, IBM Watson upgrades, and Cisco’s first voice assistant for meetings, in today’s trending stories in data science news.

New version releases

Apache Kafka goes 1.0

Open source distributed streaming platform Apache Kafka has released version 1.0.0. According to the official announcement, Apache Kafka 1.0 includes performance improvements to exactly-once semantics, significantly faster TLS and CRC32C implementations with Java 9 support, significantly faster controlled shutdown, and better JBOD support, among other general improvements and bug fixes (a minimal producer sketch appears at the end of this digest). Apache Kafka is in use at large and small companies worldwide, including Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and Zalando.

Pentaho 8.0 integrates Spark and Kafka, boosts real-time data processing capabilities

Pentaho 8.0, the next generation of the Pentaho data integration and analytics platform, has been unveiled at the PentahoWorld 2017 user conference. The new version comes better prepared for the real-time data deluge: Pentaho 8.0 fully supports stream data ingestion and processing using its native engine or Spark, and now enables real-time processing with specialized steps that connect Pentaho Data Integration (PDI) to Kafka. The 8.0 release adds support for the Knox Gateway used for authenticating users to Hadoop services. It is now easier to read and write popular big data file formats and process them with Spark using Pentaho’s visual editing tools. To increase productivity across the data pipeline, Pentaho 8.0 adds new features such as granular filters for preparing data, improved repository usability and easier application auditing.

Platform upgrades and enhancements

IBM announces set of upgrades to Watson Data Platform

IBM has announced several upgrades to its Watson Data Platform, giving data professionals a stronger foundation for AI applications. The new services include Data Catalog, which creates a complete, searchable index of structured and unstructured data in a system; Data Refinery, a tool to prepare, cleanse and process data for AI purposes; and Analytics Engine, an intelligent repository for data that combines Apache Spark and Apache Hadoop, powered by IBM Cloud Object Storage.

Adobe Analytics enhanced with advanced features for faster analysis, better customer intelligence

Adobe has added several advanced features to Adobe Analytics that provide employees with intelligence curated for their roles throughout the organization. Adobe Analytics will now have Context-Aware Sessions, Audience Analytics, and new Visualizations features, which make it easier for brands to combine dimensions, metrics and date ranges in any combination, with the ability to query billions of rows of data in seconds. There will also be improvements to the virtual report suite for mobile teams. These new capabilities enable increased collaboration, faster analysis and improved customer intelligence, allowing high-growth brands to derive meaningful insights faster and with more precision.

Breakthrough innovations in AI

Cisco Spark Assistant: World’s first AI voice assistant for meetings

Cisco Spark Assistant is going to be the world’s first enterprise-ready voice assistant for meetings, the company announced at Cisco Partner Summit. Cisco Spark Assistant will be available first on the Cisco Spark Room Series portfolio, including the new flagship Cisco Spark Room 70, the company said.
“During the next few years, AI meeting bots will be joining our work teams. When they do, people will be able to ditch the drudgery of meeting setup and other logistics to become more creative than ever,” said Rowan Trollope, SVP and GM at Cisco.

Using machine learning, Factual’s Engine will tell developers when to engage users

Location data provider Factual has launched Engine, a mobile software development kit (SDK) that lets developers add location data and intelligence to mobile apps. Engine uses machine learning to help developers know the right time to engage users: it considers business operating hours, device usage patterns, and the speed and direction of travel to determine the specific circumstance of a user. “The bar for smart and intelligent apps is rising exponentially, and developers demand solutions that help them provide personalized, effortless experiences to end users,” said Gil Elbaz, founder and CEO of Factual. “Engine is uniquely able to understand a device's exact location and movement, and using that location intelligence, design customized outcomes for users.” Engine is available for both Android and iOS.
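To make the Kafka 1.0 exactly-once item above concrete, here is a minimal sketch of an idempotent producer using the confluent-kafka Python client. The broker address and topic name are placeholders, and the `enable.idempotence` setting assumes a client and broker recent enough to support the exactly-once features described in the announcement.

```python
from confluent_kafka import Producer

# Placeholder broker address; enable.idempotence turns on the
# duplicate-free delivery underpinning Kafka's exactly-once semantics.
producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'enable.idempotence': True,
})

def on_delivery(err, msg):
    # Called once per message with the final delivery result.
    if err is not None:
        print('Delivery failed: {}'.format(err))
    else:
        print('Delivered to {} [{}]'.format(msg.topic(), msg.partition()))

# 'demo-topic' is a hypothetical topic name for illustration.
producer.produce('demo-topic', value=b'hello, kafka 1.0', callback=on_delivery)
producer.flush()  # block until all outstanding messages are delivered
```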


World's first AI exchange-traded funds could unleash a new era of automated trading systems

Abhishek Jha
02 Nov 2017
3 min read
The jury may be out on whether artificial intelligence has beaten humans at their own games, but what started as a smart business strategy is today almost everywhere. Even in stock markets. In Canada, robots have made their trading debut: in what could be the first global equity exchange-traded fund (ETF) run by machines, Horizons ETFs Management Inc.’s AI exchange-traded fund has already hit the market.

In securities trading, an exchange-traded fund is a marketable security that trades on stock exchanges. Its portfolio is managed by some form of trust company, run by human beings of course. Which is why you are bound to be apprehensive: would you invest in an ETF without a human being at the other end to convince you with decisions backed by years of market research and real-life experience?

“I’m going to be buying some but I’m buying it as a nervous investor myself,” Steve Hawkins, co-chief executive officer of Horizons ETFs Management Canada, said before the Horizons Active AI Global Equity ETF began trading on the Toronto Stock Exchange. “We don’t know what the computer will do.”

The AI-run ETF is listed on the exchange under the ticker MIND. And while it is still sub-advised by Mirae Asset Global Investments, the AI system’s investment strategy is to analyze data from 50 investment metrics and extract investment patterns that yield actionable insights. Experts feel MIND can be a big growth prospect in the ETF space. Except that, unlike the portfolio manager next door, it will never be able to explain its decisions. “We don’t know why it’s going to be making those independent decisions, but from our rigorous testing we believe that it’s going to make the right decisions,” Hawkins said.

To elaborate on the word rigorous: the AI system, developed by Korea’s Qraft Technologies, was back-tested over 10 years, during which it learned how the market reacts when data is interpreted in a certain way, and how to make smart investments in the process.

True, artificial intelligence in stock markets may seem like venturing into uncharted waters, but Hawkins is upbeat that the system will prove smarter than the average portfolio manager. “AI can do the work of a team of global strategists, can look at millions of data points very quickly, where a team of strategists would have to work 24/7, 365 days a year,” he says. “It doesn’t bring in investor bias or emotion with respect to any of its decisions, and we hope to see output that will be able to consistently outperform human decision-making.”

Count those as the most important reasons to prefer AI over humans. Artificial intelligence does not have the weaknesses of human intelligence: it doesn’t take emotional decisions, and it doesn’t plead that to err is human. Brace yourself for an AI investment manager in the future.


2nd Nov.' 17 - Headlines

Packt Editorial Staff
02 Nov 2017
5 min read
Keras update, TensorFlow eager execution, new blockchain project Thunder token, and more in today’s data science news.

Keras 2.0.9 released

The latest version of Keras, 2.0.9, has been released on GitHub with several RNN improvements, easier multi-GPU data parallelism, and a range of API changes, in addition to bug fixes and performance improvements such as native support for the NCHW data layout in TensorFlow. Implementation changes in Keras 2.0.9 result in different scaling and normalization behavior.

Google announces “eager execution” for TensorFlow

Google has unveiled a new interface, “eager execution,” making it easier to get started with TensorFlow. Eager execution is an imperative, define-by-run interface where operations are executed immediately as they are called from Python (see the sketch at the end of this digest). Announcing the release on its official blog, Google said the benefits of the eager execution interface include fast debugging, support for dynamic models, support for custom and higher-order gradients, and coverage of almost all of the available TensorFlow operations. Google is soliciting feedback on this experimental feature.

Yellowfin 7.4 released

BI platform Yellowfin has announced the release of Yellowfin 7.4. While augmented data discovery is a notable feature in the product, the addition of ETL enables the company to add data science platforms such as H2O, incorporating data science components like Predictive Model Markup Language (PMML) and Portable Format for Analytics (PFA). This means Yellowfin is now an end-to-end platform for data scientists.

PASS 2017 Summit in data science news

Microsoft sets foot on hybrid Azure SQL databases

With new advances to its SQL Server 2017 solution and Azure data services, Microsoft intends to form the ultimate hybrid data platform. The company made new on-premises and cloud announcements at PASS Summit 2017. Among the new tools, Microsoft announced the Azure SQL Database Managed Instance and Azure Database Migration Service, which enable users to ‘lift and shift’ on-premises SQL Server workloads. Both services are available in a private preview. To help integrate some of these newly launched tools, Microsoft said it put features like integration with Python and R scripts into SQL Server 2017.

Microsoft announces SQL Operations Studio

The PASS 2017 summit saw another significant announcement from Microsoft: SQL Operations Studio. The studio is a free, lightweight tool for “modern database development and operations on Windows, Mac or Linux machines for SQL Server, Azure SQL Database, and Azure SQL Data Warehouse.” It includes smart T-SQL code snippets, customizable dashboards and support for popular command line tools.

New blockchain platforms in news

Thunder token: Cornell professor announces new blockchain project that is faster and scalable

A renowned computer science professor from Cornell University is set to launch a new blockchain project called “thunder token.” Known for her work on the fundamentals of distributed systems, Elaine Shi claimed that thunder token will be able to achieve speeds 1,000x greater than existing technologies, while also addressing the common blockchain problem of scalability. Making the announcement at Ethereum's annual developer conference Devcon3, Shi said the new initiative is based on the thunderella protocol, described in a paper she co-authored with Cornell associate professor Rafael Pass.
In thunder token, the protocol proposes a split set-up so that transactions are confirmed very quickly, with the blockchain only being used in case of emergencies. The rest of the time, thunder token will use something a little less familiar: a system of agents that follows the direction of a "leader" to vote on which transactions are made according to the rules. It is not yet clear whether the protocol will be purely private or open to the public.

FundRequest develops unique blockchain incentive platform for open source projects

In a new approach to open source that could benefit both developers and organizations, FundRequest has launched a new blockchain platform for the funding, claiming, and rewarding of open source contributions. With FundRequest, users will be given access to a decentralized ecosystem that provides code-enforced guarantees against corruption. Funding is only sent if a project’s functionality can be demonstrated, and will be withheld otherwise. FundRequest’s plugin will allow users to fund open requests on networks like GitHub with a fund button. After setting up a GitHub request ticket, users can fund it through the FundRequest interface, which generates a unique smart contract to manage the payment of funds.

AI platforms in news

Hikvision announces its AI Cloud platform

At an artificial intelligence summit organized in Shenzhen, China, Hikvision unveiled its AI Cloud platform. The company said that Hikvision AI Cloud was developed to solve real-world challenges in different vertical markets, and to create continuous value for end users. Hu Yangzhong, CEO of Hikvision, who addressed the forum, noted how the ongoing trend of engineering AI algorithms into edge devices was making the edge more intelligent. "Edge computing uses local computing to enable analytics at the source of the data. With AI algorithms woven into the edge devices, only selected information such as an individual or a vehicle in a video image will be extracted and sent, which significantly enhances the transmission efficiency and reduces the network bandwidth, while still sustaining high quality and accuracy," Hu said.

Dedrone unveils DroneTracker 3 for advanced drone detection through machine learning

Drone detection technology developer Dedrone has announced the upgraded version of its software, named DroneTracker 3, which significantly expands the scope of current Dedrone features. DroneTracker 3 includes enhanced updates such as automated summary reporting, improved detection and reliability, enterprise-grade security and management, and an overall simplified setup that is easy to deploy. “Ultimately, DroneTracker 3 identifies how many drones are in an organization’s airspace, a question which was nearly impossible to answer prior to the launch of DroneTracker,” commented Joerg Lamprecht, CEO and co-founder of Dedrone.
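For the eager execution item above, here is a minimal sketch of the define-by-run style Google describes. It assumes the preview-era API location in tensorflow.contrib.eager; in later TensorFlow 1.x releases the call moved to tf.enable_eager_execution().

```python
import tensorflow as tf
import tensorflow.contrib.eager as tfe  # preview-era location of the eager API

tfe.enable_eager_execution()  # must run once, at program startup

# Operations now execute immediately, with no graph or Session required.
x = tf.constant([[2.0, 3.0]])
w = tf.constant([[1.0], [4.0]])
y = tf.matmul(x, w)

print(y)  # a concrete value right away: [[14.]]
```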


Sony resurrects robotic pet Aibo with advanced AI

Abhishek Jha
01 Nov 2017
3 min read
A decade back, when CEO Howard Stringer decided to discontinue Sony’s iconic entertainment robot AIBO, its progenitor Toshitada Doi had famously staged a mock funeral, lamenting, more than Aibo’s disbandment, the death of Sony’s risk-taking spirit. Today, as the Japanese firm’s sales have soared to a decade high, beating projected estimates, Aibo is back from the dead.

The revamped pet looks cuter than ever, after nearly a decade on hold. And it has been infused with a range of sensors, cameras, microphones and upgraded artificial intelligence features. The new Aibo is an ivory-white, plastic-covered hound that can even connect to mobile networks. Using actuators, it can move its body remarkably well, while two OLED panels in its eyes exhibit an array of expressions. Most importantly, it comes with a unique ‘adaptive’ behavior that includes actively recognizing its owner and running over to them, learning and interacting in the process, detecting smiles and words of praise, with all those head and back scratches. In short, a real-life dog, minus the canine instincts.

Priced at around $1,735 (198,000 yen), Aibo includes a SIM card slot to connect to the internet and access Sony’s AI cloud, where it analyzes and learns how other robot dogs are behaving on the network. Sony says it does not intend Aibo to replace a digital assistant like Google Home, but that it could be a wonderful companion for children and families, forming an “emotional bond” with love, affection, and joy. The cloud service that powers Aibo’s AI is, however, expensive: a basic three-year subscription plan is priced at $26 (2,980 yen) per month, or you can sign up upfront for three years at around $790 (90,000 yen). As for battery life, the robot takes three hours to fully charge after it runs down from two hours of activity.

“It was a difficult decision to stop the project in 2006, but we continued development in AI and robotics,” Sony CEO Kazuo Hirai said at a launch event. “I asked our engineers a year and a half ago to develop Aibo because I strongly believe robots capable of building loving relationships with people help realize Sony’s mission.”

When Sony initially launched AIBO in 1999, it was well ahead of its time. But after the initial euphoria, the product failed to find mainstream buyers, as reboot after reboot failed to generate profits. Back then, Sony clearly had to make a call as its core electronics business struggled through price wars. Today, times are different: AI fever has gripped the tech world. A plastic bone (‘aibone’) for the robotic dog costs around 2,980 yen. And that’s the price you pay for keeping a robotic buddy around. The word “aibo” literally means companion, after all.


1st Nov.' 17 - Headlines

Packt Editorial Staff
01 Nov 2017
5 min read
Google’s Firebase Predictions, QuickPivot’s machine learning suite Ada, tensor algebra software Taco, and more in today’s data science news.

Google Firebase in data science news

Google applies machine learning expertise to create Firebase Predictions for user segmentation

At the ongoing 2017 Firebase Dev Summit in Amsterdam, Google has unveiled Firebase Predictions, which can help “predict what users are going to do, before they actually do it.” Firebase Predictions uses machine learning on analytics data to create dynamic user groups based on users' predicted behavior. These predictions are automatically available for use with Firebase Remote Config, the Notifications composer, and A/B testing. Google said that with Remote Config, users can boost conversions with a custom experience based on each user’s predicted behavior. And while the Notifications composer will deliver the right message to the right user groups, A/B testing can help evaluate the effectiveness of prediction-based strategies.

NVIDIA in news

NVIDIA previews NVDLA deep learning processor it open sourced for deep neural network inference

Recently, NVIDIA open sourced the NVDLA deep learning processor, which is based on the architecture of its "Xavier" automotive processor. Short for “NVIDIA Deep Learning Accelerator,” NVDLA was created to promote a standard way to design deep learning inference accelerators. At a recent briefing, NVIDIA's vice president and general manager of Autonomous Machines, Deepu Talla, explained that the company decided to open source NVDLA in the belief that it could expand demand for cloud-based training of deep learning models. Currently NVDLA supports Linux, though it can be ported to other operating systems. The modular NVDLA accelerator architecture includes a convolution core, single data processor, planar data processor, channel data processor, dedicated memory and a data reshape engine.

NVIDIA initiates new AI partnerships, training courses for Deep Learning Institute

Expanding the scope of its Deep Learning Institute (DLI), NVIDIA said it is entering into new partnerships with Booz Allen Hamilton and deeplearning.ai to further broaden the range of its AI training content for thousands of students, developers and government specialists. The company has introduced a new University Ambassador Program under which instructors worldwide, including professors from Arizona State, Harvard, Hong Kong University of Science and Technology and UCLA, will teach students critical job skills and practical applications of AI at no cost. The new courses will impart domain-specific applications of deep learning for finance, natural language processing, robotics, video analytics and self-driving cars. DLI is also bringing free AI training to young people through the nonprofit organization AI4ALL.

Machine learning suite Ada in news

QuickPivot incorporates predictive models into marketing campaigns with machine learning suite Ada

To uncover insights that could drive revenue growth, QuickPivot has launched Ada, a machine learning suite of three predictive marketing models, named Churn, Basket, and Cluster. Churn applies machine learning to calculate whether a customer will churn in 30, 60 or 90 days and to understand how best to engage them before it’s too late. Basket increases average customer spend by identifying which products are often purchased together.
Cluster predicts which purchase behaviors apply to certain demographics, finding both trends and anomalies.

Tensor algebra compiler in news

Taco: ‘Tensor algebra’ software speeds computations involving ‘sparse tensors’ 100-fold

A team of researchers from MIT, the French Alternative Energies and Atomic Energy Commission, and Adobe Research have created a new system called “Taco” that automatically produces code optimized for sparse data. Taco stands for tensor algebra compiler, and it speeds up computations as much as 100-fold compared with existing software packages (a sketch of the kind of sparse kernel involved appears at the end of this digest). "Sparse representations have been there for more than 60 years," says Saman Amarasinghe, the MIT professor who is senior author on the paper. "But nobody knew how to generate code for them automatically. People figured out a few very specific operations—sparse matrix-vector multiply, sparse matrix-vector multiply plus a vector, sparse matrix-matrix multiply, sparse matrix-matrix-matrix multiply. The biggest contribution we make is the ability to generate code for any tensor-algebra expression when the matrices are sparse."

Other data science news

Pyramid Analytics unveils platform-agnostic analytics OS “Pyramid 2018”

Pyramid Analytics has announced the launch of Pyramid 2018, a server-based, multi-user analytics OS that enables advanced self-service analytics without IT help. Using Pyramid 2018, business users can manage data strategies across any environment (on-premises, in the cloud, or across hybrid deployments), irrespective of the technology (Oracle, SAP, Microsoft, big data, etc.). Pyramid 2018 also offers multiple AI engines and language support such as R, Python, TensorFlow, Weka, MLlib, SAS runtime and others, enabling organizations to integrate machine learning algorithms into their data activities.

IBM launches machine learning tool “Trusteer New Account Fraud” to prevent bank fraud

To help stop bank fraud, IBM has launched a new security tool named “IBM Trusteer New Account Fraud” that applies machine learning and analytics to identify and stop cyber criminals opening fraudulent bank accounts. The new tool, which will be added to the Trusteer Pinpoint Detect portfolio, brings together the device and network information used to open a new account, looking at both the positive information and the negative indicators in the transaction process. The tool also uses behavioural analytics to verify fraud patterns.
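To illustrate the Taco item above, here is a minimal sketch of one of the hand-written kernels it generalizes, a sparse matrix-vector multiply, using SciPy's compressed sparse row format. The matrix size and density are arbitrary illustration values.

```python
import numpy as np
from scipy import sparse

# A mostly-zero 1000x1000 matrix: only ~1% of entries are stored.
rng = np.random.RandomState(0)
A = sparse.random(1000, 1000, density=0.01, format='csr', random_state=rng)
x = rng.rand(1000)

# Sparse matrix-vector multiply touches only the stored entries,
# instead of all 1,000,000 positions a dense kernel would visit.
y = A.dot(x)
print(y.shape)  # (1000,)
```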


Dr. Brandon explains Word Vectors (word2vec) to Jon

Aarthi Kumaraswamy
01 Nov 2017
6 min read
Dr. Brandon: Welcome back to the second episode of 'Date with Data Science'. Last time, we explored natural language processing. Today we talk about one of the most used approaches in NLP: word vectors.

Jon: Hold on Brandon, when we went over maths 101, didn't you say numbers become vectors when they have a weight and direction attached to them? But numbers and words are apples and oranges! I don't understand how words could also become vectors. Unless the words are coming from my movie director and he is yelling at me :) ... What would the point of words having directions be, anyway?

Dr. Brandon: Excellent question to kick off today's topic, Jon. On an unrelated note, I am sure your director has his reasons. The following is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava.

Traditional NLP approaches rely on converting individual words, which we created via tokenization, into a format that a computer algorithm can learn (that is, predicting the movie sentiment). Doing this required us to convert a single review of N tokens into a fixed representation by creating a TF-IDF matrix. In doing so, we did two important things behind the scenes:

Individual words were assigned an integer ID (for example, a hash). For example, the word friend might be assigned to 39,584, while the word bestie might be assigned to 99,928,472. Cognitively, we know that friend is very similar to bestie; however, any notion of similarity is lost by converting these tokens into integer IDs.

By converting each token into an integer ID, we consequently lose the context in which the token was used. This is important because, in order to understand the cognitive meaning of words, and thereby train a computer to learn that friend and bestie are similar, we need to understand how the two tokens are used (for example, their respective contexts).

Given this limited ability of traditional NLP techniques to encode the semantic and syntactic meaning of words, Tomas Mikolov and other researchers explored methods that employ neural networks to better encode the meaning of words as a vector of N numbers (for example, vector bestie = [0.574, 0.821, 0.756, ..., 0.156]). When calculated properly, we will discover that the vectors for bestie and friend are close in space, where closeness is defined as a cosine similarity. It turns out that these vector representations (often referred to as word embeddings) give us the ability to capture a richer understanding of text.

Interestingly, using word embeddings also gives us the ability to learn the same semantics across multiple languages despite differences in the written form (for example, Japanese and English). For example, the Japanese word for movie is eiga; it follows that, using word vectors, these two words should be close in the vector space despite their differences in appearance. Thus, word embeddings allow applications to be language-agnostic, yet another reason why this technology is hugely popular!

Word2vec explained

First things first: word2vec does not represent a single algorithm but rather a family of algorithms that attempt to encode the semantic and syntactic meaning of words as a vector of N numbers (hence, word-to-vector = word2vec).
We will explore each of these algorithms in depth in this chapter, while also giving you the opportunity to read/research other areas of vectorization of text, which you may find helpful.

What is a word vector?

In its simplest form, a word vector is merely a one-hot encoding, whereby every element in the vector represents a word in our vocabulary, and the given word is encoded with 1 while all the other elements are encoded with 0. Suppose our vocabulary only has the following movie terms: Popcorn, Candy, Soda, Tickets, and Blockbuster. Following the logic we just explained, we would encode the term Tickets as a vector with a 1 in the Tickets position and 0s everywhere else.

Using this simplistic form of encoding, which is what we do when we create a bag-of-words matrix, there is no meaningful comparison we can make between words (for example, is Popcorn related to Soda; is Candy similar to Tickets?). Given these obvious limitations, word2vec attempts to remedy this via distributed representations for words. Suppose that for each word we have a distributed vector of, say, 300 numbers that represent a single word, whereby each word in our vocabulary is represented by a distribution of weights across those 300 elements. Given this distributed representation of individual words as 300 numeric values, we can make meaningful comparisons among words using, for example, a cosine similarity. That is, using the vectors for Tickets and Soda, we can determine that the two terms are not related, given their vector representations and their cosine similarity to one another.

And that's not all we can do! In their ground-breaking paper, Mikolov et al. also performed mathematical operations on word vectors to make some incredible findings; in particular, the authors give the following math problem to their word2vec dictionary:

V(King) - V(Man) + V(Woman) ~ V(Queen)

It turns out that these distributed vector representations of words are extremely powerful for comparison questions (for example, is A related to B?), which is all the more remarkable when you consider that this semantic and syntactic learned knowledge comes from observing lots of words and their context, with no other information necessary. That is, we did not have to tell our machine that Popcorn is a food, noun, singular, and so on.

How is this made possible? Word2vec employs the power of neural networks in a supervised fashion to learn the vector representation of words (which is an unsupervised task).

The above is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava. To learn more about word2vec and doc2vec algorithms such as continuous-bag-of-words (CBOW), skip-gram, and distributed memory, and about concepts like cosine similarity, and to build applications based on them, check out the book.
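To make the one-hot versus distributed contrast above concrete, here is a minimal NumPy sketch. The 4-dimensional embedding values are made up for illustration; trained word2vec vectors would typically have hundreds of dimensions.

```python
import numpy as np

vocab = ['Popcorn', 'Candy', 'Soda', 'Tickets', 'Blockbuster']

# One-hot encoding: a single 1 in the position of the word.
tickets_one_hot = np.zeros(len(vocab))
tickets_one_hot[vocab.index('Tickets')] = 1.0  # [0, 0, 0, 1, 0]

# Toy distributed representations (made-up values, not trained).
embedding = {
    'Tickets': np.array([0.9, 0.1, 0.4, 0.8]),
    'Soda':    np.array([0.1, 0.8, 0.7, 0.1]),
}

def cosine_similarity(a, b):
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

# One-hot vectors of different words are always orthogonal (similarity 0),
# while distributed vectors give a graded notion of relatedness.
print(cosine_similarity(embedding['Tickets'], embedding['Soda']))
```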

31st Oct.' 17 - Headlines

Packt Editorial Staff
31 Oct 2017
5 min read
Linux and AT&T’s project Acumos, blockchain platform SophiaTX, and more in today’s data science news.

Project Acumos in news

Acumos: AT&T, Tech Mahindra introduce open source AI platform hosted by Linux Foundation

The Linux Foundation, AT&T, and Tech Mahindra have together founded an open source artificial intelligence project, “Acumos,” that will help developers build, share and deploy AI applications. The platform, which is being designed with both coders and non-coders in mind, will be launched in early 2018. While AT&T and Tech Mahindra are contributing the project code, the Linux Foundation will host the platform and its AI marketplace. “Our goal with open sourcing the Acumos platform is to make building and deploying AI applications as easy as creating a website,” Mazin Gilbert, vice president of advanced technology at AT&T Labs, said in a statement.

Blockchain in news

SophiaTX open source platform can integrate blockchain with SAP

Equidato Technologies AG has introduced a new project that can embed blockchain into major ERP and financial software systems such as SAP. Named SophiaTX, the platform includes three main components: a blockchain designed specifically for business environments, a development platform with integration application programming interfaces (APIs) to SAP and other ERP software, and a marketplace for companies and individuals to buy and sell apps. SophiaTX uses proprietary blockchain technology from DECENT, an open source blockchain content distribution platform. “With blockchain, most of the disruption comes from new entrants into the ecosystem. By opening the network for anyone to join and participate, SophiaTX has the best possible chance for global adoption,” CEO Jaroslav Kacina said.

Sony could use blockchain to safeguard PlayStation Network

Sony could one day use blockchain, which secures data in cryptographic "blocks," as a means to better protect PlayStation Network users. Reportedly, Sony has applied for a patent that would use the versatility of blockchain as another layer of cybercrime prevention through multi-factor authentication (MFA). If this works out, users would be sent encrypted verification codes via the blockchain to use alongside traditional username/password login details, further securing transactions and data transfers.

Other data science news

NVIDIA announces availability of cloud container registry for AI developers worldwide

NVIDIA has launched the NVIDIA GPU Cloud (NGC) container registry for AI developers worldwide. The cloud-based service is available immediately to users of the recently announced Amazon Elastic Compute Cloud (Amazon EC2) P3 instances featuring NVIDIA Tesla V100 GPUs. NVIDIA said it plans to extend support to other cloud platforms soon. Developers who want to use the NGC container registry can follow a three-step process: sign up for a no-cost NGC account at www.nvidia.com/ngcsignup; run an optimised NVIDIA image on the cloud service provider's platform; and pull containers from NGC to get started.

Deepo: The Docker image that comes with all popular deep learning frameworks

Deepo is a Docker image with a fully reproducible deep learning research environment. It contains almost all popular deep learning frameworks, such as theano, tensorflow, sonnet, pytorch, keras, lasagne, mxnet, cntk, chainer, caffe, and torch. The project is available on GitHub under the MIT license.
Alibaba develops new machine learning services ET City Brain and ET Industrial Brain

Alibaba Cloud has developed a set of machine learning-powered platforms, such as ET City Brain and ET Industrial Brain, to solve real-world problems. The company said it is making this technology available to young minds to unleash their creativity and imagination to come up with new solutions to old problems, like “preventing a traffic jam from happening in the first place.” Min Wanli, chief data scientist and general manager of the big data division at Alibaba Cloud, told the young minds: “Don’t worry about the hard-coded part. Imagine anything possible, and then use ET Brain to try and explore that.” Earlier this year, Alibaba released version 2.0 of its PAI machine learning service, which is integrated into its various ET Brain platforms.

NTREIS makes Remine big data platform available to its members

North Texas Real Estate Information System (NTREIS) said it has launched the lead generation and big data platform Remine for its 35,000 members. The agreement was announced earlier in February this year. "At any given time, less than 2% of all properties are listed for sale in an MLS. That means that 98% of all future opportunity is hiding off-market. Remine's predictive analytics make it easy to identify future buyers and sellers,” Mark Schacknies, CFO of Remine, said.

With beta version of 3.0, Windocks releases data delivery platform based on Docker's container technology and SQL Server containers

Windocks has announced the beta release of Windocks 3.0, a data delivery platform built on Docker’s container technology, with support for SQL Server containers. “Enterprise customers are asking for an alternative to expensive, complex solutions built on Solaris UNIX,” Windocks co-founder Paul Stanton said. “Windocks 3.0 delivers the first container-native data delivery solution that fits any budget. Windocks empowers software developers and database administrators with tools to create, manage, and deliver data environments more simply and affordably than ever. In a single step SQL Server DBAs create clonable images, and users self-service environments with one click on the Windocks web application.”


Japanese scientists claim their AI system detects bowel cancer in less than a second

Abhishek Jha
30 Oct 2017
2 min read
In what could mark a major leap in cancer detection by artificial intelligence, researchers at Showa University in Yokohama, Japan, have developed AI software that they claim can spot bowel cancer in less than a second. In a recently conducted trial, the AI system was able to pinpoint potentially dangerous tumours from endoscopy images with clinical accuracy.

Led by Dr. Yuichi Mori, the study involved 250 men and women in whom colorectal polyps had been detected using endocytoscopy. In total, 306 polyps were assessed, and the scientists used the AI program to predict the pathology of each polyp. The predictions were then compared with the final pathological report, and the system was found to have correctly detected 94% of cancers by matching each growth against more than 30,000 images that were used for machine learning. What is remarkable is that it took the program less than a second to review each magnified endoscopic image and determine whether or not the polyp was malignant.

“The most remarkable breakthrough with this system is that artificial intelligence enables real-time optical biopsy of colorectal polyps during colonoscopy, regardless of the endoscopists' skill,” Mori said. While the diagnostic system is yet to obtain regulatory approval, Mori believes it could help patients avoid needless surgeries. “This allows the complete resection of adenomatous (cancerous) polyps and prevents unnecessary polypectomy (removal) of non-neoplastic polyps,” he said.

The findings were presented at the ongoing United European Gastroenterology (UEG) Week in Barcelona, Spain. The research team is now working full throttle on the project, and plans to take the study to a new level by developing an automatic polyp detection system. "Precise on-site identification of adenomas during colonoscopy contributes to the complete resection of neoplastic lesions," Mori added. "This is thought to decrease the risk of colorectal cancer and, ultimately, cancer-related death."


30th Oct.' 17 - Headlines

Packt Editorial Staff
30 Oct 2017
4 min read
Couchbase Server’s latest version, Intel’s partnership on storing cryptocurrency holdings, and more in today’s data science news.

Couchbase in news

Couchbase Server 5.0 released

NoSQL technology Couchbase Server has announced its latest version, 5.0. The new release focuses on agility, flexibility, and performance at scale; to improve customer experience, Couchbase said the 5.0 release provides the “first true Engagement Database.” Building on the role-based access control (RBAC) security model introduced in version 4.5 for administrators, Couchbase Server 5.0 introduces RBAC for applications (a minimal sketch appears at the end of this digest). There are also a number of query performance optimizations, feature enhancements and new functionality in the N1QL query engine. Couchbase Server’s web console has been redesigned with a modern, streamlined interface to the Couchbase administration and development platform that optimizes common tasks and workflows.

Intel in news

Intel, Ledger collaborate on cryptocurrency holdings storage system

To bring new solutions for storing cryptocurrency holdings, Intel has entered into a partnership with Ledger, a virtual currency hardware startup. Under the collaboration, Ledger’s Blockchain Open Ledger Operating System (BOLOS) will be integrated into Intel’s Software Guard Extensions (SGX) secure storage product line. As part of the deal, Intel and Ledger will focus on developing a so-called “enclave” wherein private keys are stored and where transactions are both generated and signed. The partnership is seen as an extension of Intel’s focus on hardware under its distributed ledger technology (DLT) strategy, with a plan to integrate mining chips into Intel products like desktop personal computers.

Deloitte and SAP in news

Deloitte unveils roadmap for SAP Leonardo based Deloitte Reimagine Platform solutions

Deloitte has announced its latest roadmap for new Deloitte Reimagine Platform solutions, which includes a wide-ranging pipeline of new use cases to support faster transformation leveraging blockchain, machine learning, IoT and advanced analytics. The Deloitte Reimagine Platform was developed through a co-innovation relationship with SAP and is based on the SAP Leonardo digital innovation system. Enterprise leaders will be able to explore use cases in person Nov. 2-3, 2017, at SAP Leonardo Live in Chicago, where Deloitte will demonstrate a number of applications, including a sensor-enabled cold chain use case that can help businesses monitor and manage temperature and humidity changes when shipping sensitive products. Another use case, a "smart tap" solution, will showcase the use of liquid flow sensors to monitor and analyze marketing campaigns, trade promotions, and inventory management in real time. Beyond the roadmap, Deloitte plans to launch a global network of virtual studios focused on the Deloitte Reimagine Platform, to provide SAP customers with an up-close, in-person view of how they can modernize operations and innovate at scale with SAP Leonardo.

Blockchain in news

Blockchain wallet officially integrates Ethereum for iOS and Android

Blockchain wallet has finally enabled support for Ethereum, according to a recent announcement. Ethereum users now get all the Blockchain wallet functionality that Bitcoin users enjoy, with the integration built in for both iOS and Android.
Blockchain-based AI project Poly AI will unveil Poly 1.0 in 2018

POLY AI, a project developing artificial intelligence on blockchain technology, has planned its first generation of AI, Poly 1.0, for 2018, with market-facing functions such as Bitcoin pricing and trading support. An ICO has been set to launch in this regard, running in four phases from Nov. 1 to Nov. 20; only contributions in BTC will be accepted.

Other data science news

Seagate announces first drive for AI-powered video surveillance solutions

Seagate Technology plc has unveiled its SkyHawk AI hard disk drive (HDD), the first drive created specifically for artificial intelligence enabled video surveillance solutions. SkyHawk AI provides optimum bandwidth and processing power to manage data-intensive workloads, while simultaneously analyzing and recording footage from multiple HD cameras. “SkyHawk AI solutions will expand the design space for our customers and partners, allowing them to implement next-generation deep learning and video analytics applications,” said Sai Varanasi, vice president of product line management at Seagate Technology.
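As a rough illustration of the Couchbase 5.0 RBAC-for-applications item above, here is a minimal sketch using the era's Python SDK; the cluster address, user credentials, bucket name, and query are all placeholders, not taken from the announcement.

```python
from couchbase.cluster import Cluster, PasswordAuthenticator
from couchbase.n1ql import N1QLQuery

# With 5.0-style RBAC, the application authenticates as a named user
# whose roles limit what it can read and write (placeholder credentials).
cluster = Cluster('couchbase://localhost')
cluster.authenticate(PasswordAuthenticator('app_user', 'app_password'))
bucket = cluster.open_bucket('travel-sample')

# A parameterized N1QL query against the bucket.
query = N1QLQuery(
    'SELECT name FROM `travel-sample` WHERE type = $t LIMIT 5', t='airline')
for row in bucket.n1ql_query(query):
    print(row)
```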


What the Cisco, Google cloud partnership means for the tech world

Abhishek Jha
27 Oct 2017
2 min read
In technology, there are no strange bedfellows. Even at the height of their cold war, Microsoft and Apple did not abhor each other, working together on tech projects that rewarded mutual benefit, a term that sometimes also covers surviving disruption. It is therefore no surprise that a partnership between Cisco and Google goes beyond helping customers run more efficient hybrid cloud environments. It’s a win-win for the two Silicon Valley giants: Google wants to catch up to Amazon, and Cisco is facing a serious threat from cloud services (possibly an existential one).

At one time, Cisco had trumped Microsoft to become the world's most valuable publicly traded company. It was Cisco whose networking hardware was used to build the internet. But today, companies that were once heavily reliant on Cisco equipment are increasingly renting cloud services. Who wouldn’t prefer the cloud, after all, when AWS shields you from all the heavy workloads in the background? Add to that the recent tie-up between VMware and Amazon, and the future looked bleak: with its software-defined data center approach, VMware posed a definite scare. To tell the truth, Cisco was running out of partners. Remember that EMC was a key Cisco partner at one time, but today Dell Technologies owns it. The game is complicated.

Which is why Cisco and Google have joined forces. Both have pioneered different eras of the internet, and promise to offer something different. With companies complaining about separate tools to manage on-premise software and software in the cloud, the security problems definitely needed to be addressed. Layering Cisco networking and security software over Google programming technology will greatly help companies manage software services that run in their own data centers or in facilities operated by external cloud providers. It brings agility and security to a hybrid ecosystem.

Google will definitely benefit from Cisco’s long list of corporate customers, Cisco being the elder partner that has witnessed generations of change in IT. But the hurdle in beating Amazon is that it never slackens, and Google knows it has mountains to climb. Where it’s all cloudy at the peak.

27th Oct.' 17 - Headlines

Packt Editorial Staff
27 Oct 2017
4 min read
Anaconda version 5.0, the Cisco-Google cloud partnership, NVIDIA Volta GPUs on AWS, and more in today’s data science news.

Anaconda in news

Anaconda Distribution 5.0 released

Data science distribution platform Anaconda has announced version 5.0, in which more than 100 packages have been added or updated. The new release offers a wider scope of compatibility, as it features all-new compilers on macOS and Linux, along with more flexible dependency pinning of NumPy packages. Anaconda Distribution 5.0 is immediately available for download and installation. Alternatively, users can upgrade to version 5.0 by running conda update conda followed by conda install anaconda=5.0.

Cisco and Google cloud collaboration in news

Cisco, Google team up to forge hybrid cloud partnership

Cisco and Google are working together on an open hybrid solution that may help companies manage software services both on Google Cloud and in their own data centers. The partnership will help customers enhance agility and security in a hybrid world, the companies said. It will provide a complete environment to develop, run, secure and monitor workloads, with which customers can improve their existing infrastructure and plan cloud migration well enough to prevent lock-in. Google said in its official blog announcement that the open source platforms Kubernetes and Istio will be at the forefront of the new architecture. “We’re working together to deliver a consistent Kubernetes environment for both on-premises Cisco Private Cloud Infrastructure and Google’s managed Kubernetes service, Google Container Engine. This way, you can write once, deploy anywhere and avoid cloud lock-in, with your choice of management, software, hypervisor and operating system,” Google said, adding that Istio will let developers use policy-driven controls to scalably connect, help secure, discover and manage applications.

Nvidia Volta GPU in news

Nvidia makes its Volta GPUs available through Amazon cloud

Amazon has beaten Google and Microsoft in the race to provide Nvidia’s next-generation Volta GPUs in the cloud through Amazon Web Services (AWS). Customers will be able to run instances with up to 8 V100 GPUs, which will initially be available from AWS’ Northern Virginia, Oregon, Ireland, and Tokyo data centers.

Other data science news

Blue Canoe secures $1.4M to improve English speaking using machine learning

Blue Canoe Learning, a new artificial intelligence startup that helps ESL speakers improve their English pronunciation, has raised an initial $1.4 million investment from Kernel Labs and others to expand its operations. Using machine learning and speech recognition, Blue Canoe has digitized the Color Vowel System and packaged it as an app in which users (non-native English speakers) play a card game and say the vocabulary word on the card; a machine learning system listens, identifies whether they have pronounced it correctly, and if not, gives relevant feedback. The startup will spend its next few months under the guidance and nurturing of the Allen Institute for AI.

Voyomotive unveils Data Analytics GateWay to make advanced vehicle data available

Voyomotive has launched an innovative program, Data Analytics GateWay, that will make advanced vehicle data available to industry partners and enable development of the next generation of automotive applications.
Voyomotive said the data provided by GateWay is typically not available from OEM or other aftermarket telematics systems and is ideally suited for AI, machine learning, driverless car, and app development. There is no cost to join GateWay, but the program is limited to corporate partners and software developers, who can use their LinkedIn accounts for instant access or apply by filling out the GateWay Partner Application.

eBay introduces deep learning-based image search capabilities for finding products using photos

As originally announced in July, eBay has launched two new visual search tools, Image Search and Find It on eBay, to let online shoppers find items using photos from their phone or the web. The new features make use of advancements in computer vision and deep learning, including neural networks, eBay said. Pinterest, Google and Amazon already offer visual search functionality.


SciPy 1.0 is here: A brief history and perspective

Abhishek Jha
26 Oct 2017
3 min read
If there is one word that exclusively defines computing parlance, it’s “version.” And that can be amusing if you are a stickler for orthodox grammar, because it took Windows 29 years to grow from version 1.01 to, eventually, 10. So now that SciPy has released its version 1.0, the developer community is abuzz with the question of why the Python library took 16 years over such a nomenclature.

In SciPy’s case, the 1.0 version number was long overdue. Given the high quality code and documentation, and the stability and backwards compatibility, a 1.0 label was guaranteed. But the best in the business are always humble (read: perfectionists). Despite being a mature and stable library that has been used in production settings for a long time, SciPy was reluctant to call itself "1.0" because it believed it was not perfect, and that there were some dusty corners left. It is otherwise normal for open source projects to arrive with a 1.0 and proclaim "we are right up there."

SciPy has a long history, during which it has matured as a software project. Largely written by and for scientists, its development community has grown dramatically over the years. It evolved from the era when the internet was just starting to bring together like-minded mathematicians and scientists, and many procrastinated on their PhDs to write extension modules for this Python library – all this when email was how you helped a project improve, long before GitHub arrived with its "patch" collaborations and inputs.

“The existence of a nascent Scipy library, and the incredible – if tiny by today's standards – community surrounding it is what drew me into the scientific Python world while still a physics graduate student in 2001,” says a nostalgic Fernando Perez, a proud SciPy author. “Today, I am awed when I see these tools power everything from high school education to the research that led to the 2017 Nobel Prize in physics.”

In SciPy 1.0, there are some major build improvements. Windows wheels are available on PyPI for the first time, and continuous integration has been set up on Windows and OS X in addition to Linux. There are a number of deprecations and API changes. But another standout statement in the release is the announcement of a formal governance structure: SciPy now consists of a BDFL (Benevolent Dictator For Life) and a Steering Committee. Pauli Virtanen is currently the BDFL.

Reminiscing over the timeline:

- 2001: the first SciPy release
- 2005: transition to NumPy
- 2007: creation of scikits
- 2008: scipy.spatial module and first Cython code added
- 2010: moving to a 6-monthly release cycle
- 2011: SciPy development moves to GitHub
- 2011: Python 3 support
- 2012: adding a sparse graph module and unified optimization interface
- 2012: removal of scipy.maxentropy
- 2013: continuous integration with TravisCI
- 2015: adding Cython interface for BLAS/LAPACK and a benchmark suite
- 2017: adding a unified C API with scipy.LowLevelCallable; removal of scipy.weave
- 2017: SciPy 1.0 release

In any case, don't be fooled by the 1.0 number. The developer community that has contributed to and nurtured SciPy for nearly two decades will keep driving forward the project that has been the bedrock of the modern scientific computing ecosystem. For as the current BDFL says, not long after 1.0 comes 1.1.
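As a small taste of the library the timeline describes, here is a minimal sketch using the sparse graph module added in 2012; the toy graph and its edge weights are arbitrary illustration values.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

# Weighted adjacency matrix of a tiny directed 4-node graph (toy values);
# zeros mean "no edge" in the sparse representation.
graph = csr_matrix(np.array([
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 1],
    [4, 0, 0, 0],
]))

# Dijkstra's algorithm over the sparse representation.
dist = shortest_path(graph, method='D', directed=True)
print(dist[0, 3])  # 4.0: the path 0 -> 1 -> 2 -> 3 costs 1 + 2 + 1
```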


26th Oct.' 17 - Headlines

Packt Editorial Staff
26 Oct 2017
5 min read
SciPy 1.0 release, Android 8.1 Developer Preview, SUSE’s Linux for SAP on IBM Cloud, and more in today’s data science news.

SciPy in news

SciPy 1.0 released

Open source Python library SciPy has announced the release of version 1.0, 16 years after its first version, 0.1, was released in 2001. SciPy 1.0 has some major build improvements: Windows wheels are available on PyPI for the first time, and continuous integration has been set up on Windows and OS X in addition to Linux. The new release, which has a number of deprecations and API changes, requires Python 2.7 or >=3.4 and NumPy 1.8.2 or greater. SciPy now also has a formal governance structure, consisting of a BDFL (Benevolent Dictator For Life) and a Steering Committee. Pauli Virtanen is currently the BDFL.

Google Android 8.1 in news

Android 8.1 Developer Preview: NNAPI to do “hardware acceleration” of machine learning, Google says

Google has launched the developer preview of Android 8.1, introducing a new Neural Networks API (NNAPI) that can provide apps with “hardware acceleration” for on-device machine learning operations. Beyond the NNAPI, there are a few other updates and bug fixes for things like autofill and notifications. Android 8.1 will have two preview releases: the first will be a "beta" with "final APIs," and the second will provide "near-final system images for final testing" in November. The final release will then arrive sometime in December.

Cloud storage in news

SUSE delivers Linux OS for SAP on IBM Cloud

Starting in the fourth quarter of 2017, SUSE Linux Enterprise Server for SAP Applications will be available as an operating system for SAP solutions on the IBM Cloud. In addition, IBM Cloud is now a SUSE Cloud Service Provider, giving customers an open source platform on a pay-as-you-go model. SUSE Linux Enterprise Server for SAP Applications on the IBM Cloud will enable customers to quickly build, deliver and deploy business-critical workloads in SAP NetWeaver and SAP HANA in the cloud. Customers can also integrate their SAP applications running on SUSE Linux Enterprise across different hardware platforms, including IBM Power, into a hybrid or private cloud deployment, and will benefit from IBM's global network of nearly 60 cloud data centers across six continents as well as access to the rich IBM Cloud catalog of services including AI, data and analytics, IoT, serverless and more.

SAP updates Vora to further simplify cloud and hybrid data storage

SAP has announced new improvements to its Vora solution, further simplifying its deployment on public clouds and making migrations more flexible. The live customer cloud service on SAP Data Network can now use the distributed computing capabilities of SAP Vora. The updated version also supports Azure Data Lake (Azure is Microsoft’s public cloud), and SAP Vora can now load and distribute data files stored in Amazon S3 (Simple Storage Service). Apart from these, the latest release comes with an improved monitoring framework, support for Apache Spark 2.x and optimizations for connectivity with the SAP HANA platform.

SAP releases Data Hub

SAP Data Hub is a solution that will help businesses tackle the complexity of their data systems and make use of the vast data gathered from various sources. SAP Data Hub creates value across a diverse data landscape through data integration, data orchestration and data governance, as well as by creating powerful data pipelines that can accelerate positive business results.
“The data hub is really a pipeline or data landscape management solution. It’s for customers who want to connect multiple data sources,” Director of Product Marketing at SAP Karen Sun said. Deep Learning AI Services in News HPE announces new deep-learning based AI platforms and services Hewlett Packard Enterprise (HPE) has unveiled new platforms and services tailored to facilitate the adoption of Artificial Intelligence. Within AI, the company will initially focus on deep learning.  The new services include HPE Rapid Software Installation for AI, HPE Deep Learning Cookbook, HPE AI Innovation Center, and Enhanced HPE Centers of Excellence (CoE). BrainChip to demonstrate AI-powered video analytics technology at Milipol 2017 BrainChip Holdings announced that it will be exhibiting at Milipol 2017 in Paris Nov. 21-24. Organised by the French Ministry of Interior in partnership with other governmental bodies, Milipol Paris is one of the largest homeland security conferences, attracting over 24 thousand visitors from 143 countries. Inspector Jean-Francois Lespes, Chief of the Indictable Offense Department at the Toulouse National Police, will be sharing use cases of BrainChip technology and show how it helps in the investigation of major crimes. Inspector Lespes' organization recently completed a successful trial of BrainChip Studio. Bitcoin miner Bitmain announces new Deep Learning AI products Bitmain has launched deep-learning based artificial intelligence products, called BM1680 and SC1. The new applications are a customized tensor computing ASIC (Application Specific Integrated Circuit) that can be applied in a variety of use cases such as image and speech recognition, robotics, autonomous vehicle technology, security surveillance, IoT, and more. “Deep learning is very intensive computationally and our experience in creating high-performing hardware for Bitcoin has absolutely prepared us for this exciting area of computing,” said Bitmain CEO Micree Zhan. “AI hardware is an area that Bitmain is proactively developing to power the next generation of AI applications.” The hardware is fully compatible with popular AI platforms including mainstream Caffe, Darknet, Googlenet, VGG, Resnet, Yolo, Yoto2 and other models. Big Data As a Service in News BlueData and Networld enter partnership to deliver big-data-as-a-service in Japan BlueData and Networld Corporation have announced a distribution agreement under which Networld will promote, market, sell, deploy, and support BlueData EPIC software in Japan. "When VMware emerged as the leader in server virtualization, we worked with them to bring their technology to Japan. Now we're in the era of Big Data analytics, data science, and deep learning. The clear leader in bringing virtualization and containerization to the Big Data ecosystem is BlueData, and we are proud to partner with them in Japan," President and CEO of Networld Shoichi Morita said.
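As a practical footnote to the SciPy item above, here is a minimal sketch (ours, not from the release announcement) for confirming that an environment meets the stated version requirements before upgrading:

```python
# A quick environment check against SciPy 1.0's stated requirements:
# Python 2.7 or >= 3.4, and NumPy >= 1.8.2.
import sys

import numpy
import scipy

print("Python:", sys.version.split()[0])  # should be 2.7 or >= 3.4
print("NumPy: ", numpy.__version__)       # should be >= 1.8.2
print("SciPy: ", scipy.__version__)       # 1.0.0 after `pip install -U scipy`
```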

From Graph Database to Graph Company: Neo4j's Native Graph Platform addresses evolving needs of customers

Abhishek Jha
25 Oct 2017
3 min read
In their own words, Neo4j is evolving from being "just a graph database company" to becoming a full-fledged graph technology platform spanning analytics, ETL and visualization. Today, at their GraphConnect conference in New York, Neo4j announced a new Native Graph Platform that will add analytics, data import and transformation, visualization, and discovery on top of its graph database.

The announcement does not stand alone. It comes alongside an open-source contribution to the Hadoop ecosystem: Cypher, Neo4j's graph query language, is now available for Apache Spark. It's no secret that Neo4j wants to make Cypher the standard query language for graphs, and with all of the components in the Native Graph Platform using Cypher, the new set of tools is sure to boost adoption.

This is why it's more of a strategic shift, going far beyond a switch from graph database to graph solutions. It is, in fact, set to dramatically expand Neo4j's enterprise footprint by establishing relationships with a variety of new users and roles, including data scientists, big data experts, IT business analysts and line-of-business managers.

The story about the 'evolving needs of customers' is true. Today, customers do not deploy in isolation. We are living in a polyglot tech world with heterogeneous backends, and needs are bound to change.

"Many companies started with us for retail recommendation engines or fraud detection, but now they need to drive their next generation of connected data to power complex artificial intelligence applications," CEO Emil Eifrem says. "Our customers not only need a high performance, scalable graph database, they need algorithms to feed it, they need visualization tools to illustrate it, they need data integration to dive deeply into their data lakes," Eifrem adds, hinting at how the new Native Graph Platform would facilitate Neo4j's 'connections-first' approach.

Whether for increased revenue, fraud detection or planning for a more connected future, building networks of connected data proves to be the single biggest competitive advantage for companies today. This will become even more evident as machine learning, intelligent devices and real-time activities like conversational commerce all come to depend on connections. This is probably why Neo4j is extending the reach of its native graph stack, which has already seen success across multiple use cases with organizations ranging from NASA to eBay to Comcast.

But what about big players like Oracle jumping into the competition? "When we got started, there was no one. Now, in the past couple of years, everyone and their mom have released a graph database," Eifrem said. "The space is very much heating up."

"There are two sides to it. Of course, when you have Oracle, SAP, Amazon and Microsoft all announcing that they're going to your space [it means] we're up against, from our perspective, infinite resources – and that is scary."

Yet, Neo4j is not scared. The crucial thing, according to Eifrem, is that the continued awareness has brought graph technology to the mainstream, and that is where Neo4j sees more opportunity than threat. "We don't have the biggest microphone in the world. We've stood alone on this mountain for the longest time, and now we have some really powerful voices joining in. That's 10 times more important than losing the occasional deal because Oracle had a lock-in on that customer."
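For readers who haven't seen Cypher, the query language the whole platform now standardizes on, a small sketch of what a pattern-matching query looks like from Python may help. This is our own illustration, not code from Neo4j's announcement: the connection URI, credentials, and the Person/FRIEND schema are placeholder assumptions, and it uses the official neo4j Python driver.

```python
# Illustrative only: the URI, credentials, and Person/FRIEND data model are
# hypothetical placeholders; `neo4j` is the official Python driver package.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Cypher states the graph pattern directly: friends-of-friends of a person
# who are not already direct friends, a classic recommendation query.
query = """
MATCH (a:Person {name: $name})-[:FRIEND]->(:Person)-[:FRIEND]->(fof:Person)
WHERE NOT (a)-[:FRIEND]->(fof) AND fof <> a
RETURN DISTINCT fof.name AS suggestion
"""

with driver.session() as session:
    for record in session.run(query, name="Alice"):
        print(record["suggestion"])

driver.close()
```

The same declarative pattern style is what the Cypher-for-Apache-Spark contribution mentioned above brings to Spark workloads.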

Dr. Brandon explains NLP (Natural Language Processing) to Jon

Aarthi Kumaraswamy
25 Oct 2017
5 min read
[box type="shadow" align="" class="" width=""] Dr.Brandon: Welcome everyone to the first episode of 'Date with data science'. I am Dr. Brandon Hopper, B.S., M.S., Ph.D., Senior Data Scientist at BeingHumanoid and, visiting faculty at Fictional AI University.  Jon: And I am just Jon - actor, foodie and Brandon's fun friend. I don't have any letters after my name but I can say the alphabets in reverse order. Pretty cool, huh! Dr.Brandon: Yes, I am sure our readers will find it very amusing Jon. Talking of alphabets, today we discuss NLP. Jon: Wait, what is NLP? Is it that thing Ashley's working on? Dr.Brandon: No. The NLP we are talking about today is Natural Language Processing, not to be confused with Neuro-Linguistic Programming.   Jon: Oh alright. I thought we just processed cheese. How do you process language? Don't you start with 'to understand NLP, we must first understand how humans started communicating'! And keep it short and simple, will you? Dr.Brandon: OK I will try my best to do all of the above if you promise not to doze off. The following is an excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla and Michal Malohlava. [/box]   NLP helps analyze raw textual data and extract useful information such as sentence structure, sentiment of text, or even translation of text between languages. Since many sources of data contain raw text, (for example, reviews, news articles, and medical records). NLP is getting more and more popular, thanks to providing an insight into the text and helps make automatized decisions easier. Under the hood, NLP is often using machine-learning algorithms to extract and model the structure of text. The power of NLP is much more visible if it is applied in the context of another machine method, where, for example, text can represent one of the input features. NLP - a brief primer Just like artificial neural networks, NLP is a relatively "old" subject, but one that has garnered a massive amount of attention recently due to the rise of computing power and various applications of machine learning algorithms for tasks that include, but are not limited to, the following: Machine translation (MT): In its simplest form, this is the ability of machines to translate one language of words to another language of words. Interestingly, proposals for machine translation systems pre-date the creation of the digital computer. One of the first NLP applications was created during World War II by an American scientist named Warren Weaver whose job was to try and crack German code. Nowadays, we have highly sophisticated applications that can translate a piece of text into any number of different languages we desire!‌ Speech recognition (SR): These methodologies and technologies attempt to recognize and translate spoken words into text using machines. We see these technologies in smartphones nowadays that use SR systems in tasks ranging from helping us find directions to the nearest gas station to querying Google for the weekend's weather forecast. As we speak into our phones, a machine is able to recognize the words we are speaking and then translate these words into text that the computer can recognize and perform some task if need be. Information retrieval (IR): Have you ever read a piece of text, such as an article on a news website, for example, and wanted to see similar news articles like the one you just read? 
This is but one example of an information retrieval system that takes a piece of text as an "input" and seeks to obtain other relevant pieces of text similar to the input text. Perhaps the easiest and most recognizable example of an IR system is doing a search on a web-based search engine. We give some words that we want to "know" more about (this is the "input"), and the output are the search results, which are hopefully relevant to our input search query. Information extraction (IE): This is the task of extracting structured bits of information from unstructured data such as text, video and pictures. For example, when you read a blog post on some website, often, the post is tagged with a few keywords that describe the general topics about this posting, which can be classified using information extraction systems. One extremely popular avenue of IE is called Visual Information Extraction, which attempts to identify complex entities from the visual layout of a web page, for example, which would not be captured in typical NLP approaches. Text summarization (darn, no acronym here!): This is a hugely popular area of interest. This is the task of taking pieces of text of various length and summarizing them by identifying topics, for example. In the next chapter, we will explore two popular approaches to text summarization via topic models such as Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA). If you enjoyed the above excerpt from the book Mastering Machine Learning with Spark 2.x by Alex Tellez, Max Pumperla, and Michal Malohlava, check out the book to learn how to Use Spark streams to cluster tweets online Utilize generated models for off-line/on-line prediction Transfer learning from an ensemble to a simpler Neural Network Use GraphFrames, an extension of DataFrames to graphs, to study graphs using an elegant query language Use K-means algorithm to cluster movie reviews dataset and more
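As flagged in the summarization bullet above, here is a minimal, self-contained topic-modeling sketch. It deliberately uses scikit-learn on a toy corpus rather than Spark (the book's own examples are in Spark 2.x), so read it as an illustration of the LDA idea, not the book's code; the corpus and topic count are made up for the demo.

```python
# A toy Latent Dirichlet Allocation (LDA) example with scikit-learn:
# turn raw documents into term counts, fit a two-topic model, and
# print the most probable words for each topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the spacecraft entered orbit around the red planet",
    "the team scored a late goal to win the match",
    "nasa launched a new probe to study the planet",
    "fans cheered as the striker scored twice",
]

# Bag-of-words term-count matrix (LDA expects counts, not tf-idf).
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# Fit a two-topic LDA model; random_state makes the run reproducible.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Show the top four words per topic.
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[::-1][:4]]
    print(f"Topic {i}: {', '.join(top)}")
```

On this tiny corpus, one topic should gather the space-related words and the other the sports-related ones, which is exactly the "summarizing by identifying topics" idea the excerpt describes.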