Tech News - Data


Is space the final frontier for AI? NASA to reveal Kepler's latest AI backed discovery

Savia Lobo
12 Dec 2017
4 min read
Artificial Intelligence is helping NASA in its expeditions! NASA's Kepler, the planet-hunting telescope, has made a brand new discovery, and NASA will hold a major press conference this Thursday afternoon to announce the breakthrough.

Kepler, launched in 2009, has discovered numerous planets (exoplanets) outside our solar system, some of which could support life. In 2014, the telescope began a major mission called K2, which hunts for more exoplanets while studying other cosmic phenomena. The Kepler mission has already produced huge discoveries and, remarkably, suggests that the universe contains numerous planets that could support life.

This latest breakthrough was made with the help of Google's artificial intelligence, which analyzes the data coming from Kepler. With Google's AI, NASA expects to cut down the time required to shortlist planets that show the possibility of life. It is the stuff of Christopher Nolan's famous movie Interstellar, in which Joseph Cooper, an astronaut, goes in search of other planets to colonize as sustaining life on Earth becomes increasingly difficult. After decades of research and exploratory missions to far-flung planets, the research team arrives at three candidate planets most likely to be suitable for human life. An AI-backed Kepler could have saved that team time in narrowing down the list of planets with the potential for life. In short, AI will ease the scientists' task by letting them sift through the data sent by the telescope and quickly pick out planets that look interesting enough for further exploration.

NASA said that Paul Hertz, director of NASA's astrophysics division; Christopher Shallue, a senior Google software engineer; Jessie Dotson, Kepler project scientist at NASA's Ames Research Center in Silicon Valley, California; and two other scientists would take part in the upcoming conference. Very little is known about the conference itself, as NASA is staying tight-lipped: its press release states only, cryptically, that machine learning "demonstrates new ways of analyzing Kepler data." Let's keep our enthusiasm intact for the disclosure of Kepler's latest breakthrough this Thursday. To watch the conference live, stream it on NASA's official website.

Until then, here are some ways we think AI could assist humans with space exploration:

Processing data collected in space: The ENVISAT satellite collects 400 terabytes of data per year, and over time the count may reach 720 terabytes a day. To process this data, scientists have created a network of computers; each computer receives a small data packet and processes it with the help of AI before the packets are regrouped. Using this data, scientists can track activity in the Earth's atmosphere, monitor solar activity, and so on.

Rapid communications to and from Earth: When talking to astronauts or computers orbiting the Earth, it takes less than a second for a radio wave to carry a signal from the International Space Station (ISS) to Earth, but the delay differs for other satellites and planets, so it isn't feasible to relay commands for every action from Earth. With the help of AI, on-board computers can think for themselves and adjust the communication time frame accordingly.
Machines will walk the planet to see if the possibility of life exists: Not all planets are suitable for humans to walk on, and even if they were, given the journey time for a mission it may not always be practical to have humans do the on-ground study themselves. For instance, the moons of Jupiter, specifically Ganymede and Europa, are interesting places to look for life because of their vast liquid-water oceans; however, the intense radiation fields around Jupiter make human survival there impossible. Expeditions to look for the possibility of life on planets other than Earth can therefore be carried out by machines with the help of AI. NASA is developing a robot called Robonaut for the International Space Station; eventually, the expectation is that Robonaut will carry out risky spacewalks while astronauts manage it from the safety of the space station.

There are many other ways AI can be used to guide our space explorations, and scientists are still discovering unique ways in which AI can assist them in their space missions.

“The future is quantum” — Are you excited to write your first quantum computing code using Microsoft’s Q#?

Abhishek Jha
12 Dec 2017
8 min read
The quantum economy is coming

"I've seen things you people wouldn't believe" – few monologues are loved as much as the replicant Roy Batty's (played by Rutger Hauer) legendary death speech. But while the 1982 release Blade Runner showed us a world where everything that could go wrong had gone wrong, the monologue retains its sense of disbelief in 2017, albeit from the other end: the promise of an astonishing future has a positive undertone this time! As if artificial intelligence, the Internet of Things, and self-driving cars were not enough, the big daddy of them all, quantum computing, is underway. It's not yet a fad, but the craze for such a marvel abounds. Every inch we move towards quantum computing (it's acceleration, stupid!), the future looks more stupefying. And now, with Microsoft releasing its own quantum programming language and a development kit, it is one hell of an opportunity to live in a time when quantum computing nears reality. Which is a different ball game: the moment you hear about quantum computing, you forget about linear algebra.

Watch: https://www.youtube.com/watch?v=doNNClTTYwE

A giant leap forward, quantum computing is set to alter our economic, industrial, academic, and societal landscape forever. In just hours or days, a quantum computer can solve complex problems that would otherwise take classical computers billions of years. This has massive implications for research in healthcare, energy, environmental systems, smart materials, and more.

What is inside Microsoft's quantum development kit?

Microsoft had already announced its plans to release a new programming language for quantum computers at its Ignite conference this year, saying at the time that the launch might come by the end of 2017. That day has come: Microsoft is previewing a free version of its Quantum Development Kit. The kit includes all of the pieces a developer needs to get started, including the Q# programming language (yesteryear programmers like me will pronounce it "Q sharp") and compiler, a Q# library, a local quantum computing simulator, a quantum trace simulator, and a Visual Studio extension. The preview is aimed at early adopters who want to understand what it takes to develop programs for quantum computers.

Introducing Q#

Microsoft describes Q# as "a domain-specific programming language used for expressing quantum algorithms. It is to be used for writing sub-programs that execute on an adjunct quantum processor, under the control of a classical host program and computer." If you remember, Satya Nadella said at the Ignite announcement that while developers could use the proposed language on classical computers to try their hand at developing quantum apps, in future they will be writing programs that actually run on topological quantum computers. Consider this the unique selling point of Q#! "The beauty of it is that this code won't need to change when we plug it into the quantum hardware," said Krysta Svore, who oversees the software aspects of Microsoft's quantum work. And in case you wish to learn how to program a quantum computer using the Q# language, you will find yourself at home if you are acquainted with Microsoft Visual Studio, with which Q# is "deeply integrated". Besides, Q# has several elements of C#, Python, and F# ingrained, along with new features specific to quantum computing.
Quantum simulator

Part of Microsoft's development kit is a quantum simulator that lets developers figure out whether their algorithms are feasible and can run on a quantum computer. It lets programmers test their software on a traditional desktop computer or through the Azure cloud-computing service. You can simulate a quantum computer of about 30 logical qubits on your laptop (so you don't have to rely on a remote server); to simulate more than 40 logical qubits, you can use an Azure-based simulator. Remember, Microsoft is competing with the likes of Google and IBM to create real-life quantum computers that are more powerful than a handful of qubits, so a simulator that lets developers test programs and debug code on their own machines is necessary, since there really aren't any quantum computers for them to test their work on yet. Once Microsoft is able to create a general-purpose quantum computer, applications created with this kit will be supported on it. By offering the more powerful simulator, one with over 40 logical qubits of computing power, through its Azure cloud, Microsoft is hinting that it envisions a future where customers use Azure for both classical and quantum computing.

New tutorials and libraries

In addition to the Q# programming language and the simulator, the development kit includes a companion collection of documentation, libraries, and sample programs. A number of tutorials and libraries are supplied to help developers experiment with the new paradigm. These may help them get a better foothold on the complex science behind quantum computing and develop familiarity with aspects of computing that are unique to quantum systems, such as quantum teleportation: a method of securely sharing information across quantum computing bits, or qubits, that are connected by a quantum state called entanglement. "The hope is that you play with something like teleportation and you get intrigued," Svore said.

Microsoft is using a 'different design' for its topological quantum computer

Microsoft is still trying to build a working machine, but it is using a very different approach that it says will make its technology less error-prone and more suitable for commercial use. The company is pursuing a novel design based on controlling an elusive particle called a Majorana fermion, a concept that until recently was almost unheard of. Engineers have almost succeeded in controlling the Majorana fermion in a way that will enable them to perform calculations, said Todd Holmdahl, head of Microsoft's quantum computing efforts, adding that Microsoft expects to have a quantum computer on the market within five years.

These systems push the boundaries of how atoms and other tiny particles work. While traditional computers process bits of information as 1s or 0s, quantum machines rely on "qubits" that can be a 1 and a 0 at the same time. Two qubits can thus represent four numbers simultaneously, three qubits can represent eight, and so on. This means quantum computers can perform certain calculations much faster than standard machines and tackle problems that are far more complex. Theoretically, a topological quantum computer is designed in a way that creates more stable qubits. This could produce a machine with an error rate 1,000 to 10,000 times better than the computers other companies are building, according to Holmdahl, who previously led the development of the Xbox and the company's HoloLens goggles.
Researchers have so far only been able to keep qubits in a quantum state for fractions of a second; when qubits fall out of a quantum state, they produce errors in their calculations, which can negate any benefit of using a quantum computer. The lower error rate of Microsoft's design may mean it can be more useful for tackling real applications, even with a smaller number of qubits, perhaps fewer than 100. Interestingly, Svore said that her team has already proven mathematically that algorithms using a quantum approach can speed up machine learning applications substantially, enabling them to run as much as 4,000 times faster.

The future is quantum

Make no mistake: the race for quantum computing has already flared up, to the extent that rivals Google and IBM are competing to achieve what they call quantum supremacy. At the moment, IBM holds the pole position with its 50-qubit prototype (at least until Google reveals its cards). But with Microsoft coming up with its own unique architecture, it is difficult to underplay Redmond's big vision. During its Ignite announcement, the company stressed a "comprehensive full-stack solution" for controlling the quantum computer and writing applications for it. That means it is in no hurry. "We like to talk about co-development," Svore said. "We are developing those [the hardware and software stack] together so that you're really feeding back information between the software and the hardware as we learn, and this means that we can really develop a very optimized solution." The technology is still in a long research phase, but the prospects are bright. Brought online, quantum computing could single-handedly turn unreal things into real-world use cases. Going from "a billion years on a classical computer to a couple of hours on a quantum computer" has taken decades of research. And unlike Blade Runner, all those moments will not be lost in time like tears in the rain.
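To make the qubit counting above concrete (two qubits span four amplitudes, three span eight), here is a minimal NumPy sketch; it is only an illustration of the arithmetic and has nothing to do with Microsoft's Q# simulator.

```python
import numpy as np

# |0> state of a single qubit, and the Hadamard gate that puts it into equal superposition.
zero = np.array([1.0, 0.0])
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

def equal_superposition(n_qubits):
    """State vector of n qubits, each prepared in (|0> + |1>) / sqrt(2)."""
    state = H @ zero
    for _ in range(n_qubits - 1):
        state = np.kron(state, H @ zero)   # tensor product grows the register
    return state

for n in (1, 2, 3):
    state = equal_superposition(n)
    # An n-qubit register carries 2**n amplitudes at once: 2, 4, 8, ...
    print(n, "qubit(s) ->", len(state), "amplitudes:", np.round(state, 3))
```

A real quantum simulator tracks and transforms exactly this exponentially growing amplitude vector, which is why 30 or so logical qubits is roughly the limit for a laptop.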

12th Dec.' 17 - Headlines

Packt Editorial Staff
12 Dec 2017
5 min read
NASA's breakthrough using Google's machine learning, the Kubeflow machine learning toolkit for Kubernetes, Lexalytics' 'words-first' AI for natural language processing, and more in today's top stories in AI, machine learning, and data science news.

Artificial intelligence for space exploration

Google's machine learning may drive NASA's next breakthrough announcement

NASA could announce a major breakthrough discovery from its alien-hunting Kepler telescope, driven by Google's machine-learning artificial intelligence software. The press conference will take place Thursday, December 14, at 1 pm EST, and will be live-streamed on NASA's website. Experts from NASA and Google will be present to explain the latest breakthrough; the attendees include Paul Hertz, the director of NASA's astrophysics division, and Christopher Shallue, senior research software engineer at Google Brain, the tech giant's machine intelligence research team. NASA's Kepler space telescope has been searching for habitable planets since 2009.

Has the futures contract legitimized Bitcoin?

Bitcoin price steadies, albeit a little, as first futures contracts begin trading on Cboe exchange

Following the launch of the Cboe XBT tradable bitcoin futures contracts on the Cboe Global Markets exchange, the price of bitcoin steadied in trading to some extent. The price fluctuated for a while, but upon XBT's debut it rose to around $15,800 under the influence of the futures contracts (at last close the price was $16,765.99). A futures contract is a tradable legal agreement that entitles its holder to buy or sell a particular commodity or financial instrument at a predetermined price at a specified time in the future. For bitcoin, that gives a contract holder the ability to buy or sell bitcoin at a set price in the future, which in theory takes away some of the speculation surrounding bitcoin by locking in projected future prices. Still, some industry experts are warning that bitcoin could just be in the middle of a bubble; they say that not only will the bitcoin bubble burst in 2018, its crash will also impact the global economy.

Microsoft cruising through quantum computing

Microsoft releases quantum computing development kit preview

Microsoft has released a preview of its quantum computing development kit. The kit includes all of the pieces a developer needs to get started, including the Q# language and compiler, a Q# library, a local quantum computing simulator, a quantum trace simulator, and a Visual Studio extension. The preview is aimed at early adopters who want to understand what it takes to develop programs for quantum computers. Microsoft had said in September that it plans to offer a "comprehensive full-stack solution" for controlling the quantum computer and writing applications for it. The race for quantum computing has heated up after IBM announced a 50-qubit prototype.

Kubeflow: bringing together Kubernetes and machine learning

Introducing Kubeflow to bring composable, easier-to-use stacks with more control and portability for Kubernetes deployments for all ML, not just TensorFlow

A new project, Kubeflow, has been announced to make machine learning on Kubernetes easy, portable, and scalable. Kubeflow lets the system take care of the details and supports the kind of tooling ML practitioners want and need. Kubeflow gives users an easy-to-use ML stack anywhere Kubernetes is already running, and it should self-configure based on the cluster it deploys into.
While Kubeflow contains support for creating a JupyterHub, Kubeflow users can also create a TensorFlow Training Controller for configuring CPUs or GPUs, and it helps adjust the size of a cluster with a single setting. How is it better than a plain Docker image on Kubernetes? First of all, Kubeflow is great for anyone already using Kubernetes, and it also brings scalability to people with existing on-premise or cloud-based servers. In general, if you're wiring together five or more services and systems to create an ML stack, Kubeflow should simplify your workload. Read more about Kubeflow on GitHub.

Lexalytics pioneers 'words-first' AI

Lexalytics launches new pipeline for building machine learning-based artificial intelligence applications for natural language processing

Lexalytics has launched a new machine learning platform, Lexalytics AI Assembler, to facilitate seamless insights from the huge amount of natural language data surrounding the enterprise. With AI Assembler, Lexalytics is now providing enterprise customers with the same tools to accomplish tasks like raising the accuracy of named entity recognition by 25 percent in a fraction of the time it would take with standard industry technologies. Lexalytics is also launching a limited-availability beta release of Semantria Storage & Visualization, a content storage, aggregation, search, and reporting framework that provides business analysts and marketers a single access point to interact with their data. "With today's announcement, Lexalytics is pioneering the field of 'words-first' AI, offering our customers the same machine learning tools we use internally to power our text and sentiment analysis platforms, along with easy-to-use data storage and visualization tools to make the most of their data," said Jeff Catlin, CEO of Lexalytics. For more details on AI Assembler, visit https://www.lexalytics.com/assembler.

"The Blockchain to Fix All Blockchains" – Overledger, the meta blockchain, will connect all existing blockchains

Abhishek Jha
11 Dec 2017
5 min read
Throughout the '80s and '90s, the internet's inventors harped on using cryptography to solve the problems of security, privacy, and inclusion. The quest led Nick Szabo to propose, in 1998, what he called "The God Protocol", in which all parties would send their inputs to the most trustworthy third party imaginable. A decade later, the global financial markets collapsed and digital currencies started taking over. Cryptocurrencies, after all, promised to ensure the integrity of data exchanged among billions of devices without going through a trusted third party! Whether or not Nick's ideal protocol was a direct precursor to the present-day bitcoin architecture, its underlying technology, blockchain, is claimed to offer state-of-the-art cryptography: there is no central database to be hacked, and the data is double-encrypted. Perhaps the internet pioneers have got their trust protocol. Or maybe not: an overwhelming number of the blockchains that have flooded the financial markets of late have failed.

Enter Overledger. Just another blockchain? Not quite; it could link all the other blockchains, and in that, it's a first. It is the brainchild of London-based Quant Network. "The uniqueness of our operating system is that Overledger is not another blockchain," Quant Network Chief Strategist Paolo Tasca said. "We do not impose new consensus mechanisms, new gateways, adapters or special validating nodes on top of existing blockchains. Overledger is a virtual blockchain that links existing blockchains and allows developers to build multi-chain applications (or in other terms blockchain-agnostic applications)."

The idea is to facilitate data interoperability across different blockchains in a manner similar to TCP/IP, which propelled the internet; except that instead of the internet of information, we are talking about the internet of value, or money. (Image courtesy of Quant Project.) To some, Overledger seems to be a straightforward extension of the atomic swap concept, but unlike atomic swaps, which only support currencies, Overledger works for any data that can be put on a blockchain. Gilbert Verdian, CEO and co-founder of Quant Network, confirmed that a patent for the Overledger technology was filed in the first week of December. According to Verdian, Quant Network is focusing on three goals: developing an API to connect the world's networks to multiple blockchains; bridging existing networks (e.g. financial services) to new blockchains; and developing a new blockchain operating system with a protocol and a platform to create next-generation, multi-chain applications.

Promoting the Blockchain ISO Standard

Verdian pioneered the development of the blockchain ISO standard TC 307, which will allow for interoperability, governance, and a reference architecture so that different blockchain technologies can work together. "There is no one blockchain standard or protocol currently in use," Verdian commented. "International standards will allow for interoperability and implementation and use of multiple blockchain-related protocols." Currently, 40 countries and organizations, such as the European Commission, are working on developing the blockchain ISO standard, and the timeline is to have a published standard in 2020. "Quant Overledger will be compatible to the Blockchain ISO Standard when it is released, allowing a gateway to 'talk' a common language to other networks and existing systems such as financial services networks," Verdian said. "The entry and exit points of Overledger will be compatible to the ISO Standard, which any other technology vendor can also implement in future."

Benefits of interoperability

According to Verdian, the widespread adoption and use of international blockchain standards could facilitate a new wave of innovation, productivity, employment, and industry opportunities. For example, the growing burden of KYC compliance could be reduced through international blockchain standards that utilize shared databases for undertaking business and transacting payments. The development of international standards to support smart contracts has the potential to decrease contracting, compliance, and enforcement costs. Similarly, international blockchain standards could reduce transaction costs for SMEs when dealing with governments and businesses. "Quant will completely change how people will be able to interact with blockchains in a way that's not possible today," Verdian noted. "A good example is the recognition of a person's identity by one entity on a blockchain will be recognized and understood by every other blockchain and every entity connected to those."

Dapp development

There are plans for a Quant App Store that will allow developers and startups to create multi-chain applications on top of Overledger and monetize them in unique ways, without having to rely on the capabilities of only one blockchain. "As a company, we're also planning to release distributed applications on top of Quant in the areas of RegTech, FinTech and HealthTech," Verdian said, adding that by allowing businesses to directly interact with multiple blockchains, they will be better able to cope with modern supply-chain complexities. An initial coin offering will be launched in February in this regard; before that, tokens will be sold in a pre-ICO in January. Developers will be able to publish distributed apps on the Quant Network store and optionally monetize their apps by charging usage fees in the tokens. Quant Network plans to release the first versions of Overledger in Q1 2018 and finalize the SDK and libraries in Q3 2018. This will be an open-source, freely available software release that developers and enterprises can use to create next-generation multi-chain applications. The Quant App Store is then slated for release at the end of 2018 for developers to publish their apps and earn Quant tokens.

Just to reiterate, Overledger is not just another blockchain; it connects all the existing blockchains. The new platform enables the grand reconciliation of all digital transactions, just about everything, in real time. In the days to come, billions of 'smart' things in the physical world will be sensing, responding, communicating, and sharing important data in fields ranging from environmental protection to health. This Internet of Everything certainly needs a Ledger of Everything, and in that context a meta blockchain is not a bad idea. Such seamless communication across multiple blockchains will also allow businesses to handle modern supply-chain complexities better.

AlphaZero: The genesis of machine intuition

Savia Lobo
11 Dec 2017
3 min read
Give it four days to practice and you would have a chess master ready! That line holds true for DeepMind's latest AI program, AlphaZero. AlphaZero is an advanced version of AlphaGo Zero, the AI that recently won all its games of Go against its precursor AlphaGo by relying purely on self-play, without any example games. AlphaZero improves on it by showing that the same program can master three different board games: chess, shogi, and Go. It uses a reinforcement learning algorithm to achieve state-of-the-art results.

AlphaZero mastered the game of chess without any prior domain knowledge except the game rules. It also mastered shogi, a Japanese board game, as showcased in a recent DeepMind research paper. Demis Hassabis, founder and CEO of DeepMind, shared some additional details about AlphaZero at the Neural Information Processing Systems (NIPS) conference in Long Beach, California. "It doesn't play like a human, and it doesn't play like a program, it plays in a third, almost alien, way," said Hassabis. It took only four hours of self-play to create chess knowledge beyond any human or computer program. Remarkably, it defeated Stockfish 8 (a world champion chess engine) within four hours without any external help or prior empirical data such as a database of archived chess games or well-known chess strategies and openings.

The hyper-parameters of AlphaGo Zero's search were tuned using a Bayesian optimization algorithm; AlphaZero reuses the same hyper-parameters for all the board games without any game-specific tuning. As with AlphaGo Zero, AlphaZero's board state is encoded by spatial planes based only on the basic rules of each game. While training AlphaZero, the same algorithmic settings, network architecture, and hyper-parameters were used for all three games, with a separate instance of AlphaZero trained for each game. Training ran for 700,000 steps (mini-batches of size 4,096) starting from randomly initialized parameters, with 5,000 first-generation TPUs generating self-play games and 64 second-generation TPUs training the neural networks. After comprehensive analysis, it was found that AlphaZero outperformed Stockfish at chess within 4 hours, Elmo at shogi in less than 2 hours, and AlphaGo Lee at Go within 8 hours.

The achievements of AlphaZero are impressive, to say the least. Researchers at DeepMind note that it still needs to play many more practice games than a human chess champion; human learning draws on watching other people play and on learning in other ways that a machine cannot match, but the machine can go beyond human thinking by expanding the capabilities of its program. To know more about how AlphaZero masters chess and shogi using reinforcement learning, have a look at the research paper or tune into the game series on YouTube.
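For readers curious how a self-play engine of this kind picks moves during search, the selection rule commonly described for AlphaZero-style Monte Carlo tree search is the PUCT formula. The snippet below is a simplified, hypothetical Python sketch of that scoring rule, not DeepMind's code; the move names, statistics, and the c_puct constant are made up for illustration.

```python
import math

def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.25):
    """PUCT-style score: exploit high value, explore moves the policy likes but has visited little."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration

# Toy example: three candidate moves with made-up statistics from earlier simulations.
children = [
    {"move": "e2e4", "q": 0.12, "prior": 0.35, "visits": 40},
    {"move": "d2d4", "q": 0.10, "prior": 0.30, "visits": 25},
    {"move": "g1f3", "q": 0.05, "prior": 0.10, "visits": 5},
]
parent_visits = sum(c["visits"] for c in children)

best = max(children, key=lambda c: puct_score(c["q"], c["prior"], parent_visits, c["visits"]))
print("move selected for the next simulation:", best["move"])
```

The interplay between the value estimate (from self-play results) and the policy prior (from the neural network) is what lets the search focus on a handful of promising lines instead of brute-forcing the game tree.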

11th Dec.' 17 - Headlines

Packt Editorial Staff
11 Dec 2017
5 min read
DeepMind's AlphaZero beating the best chess bot, Numba 0.36, Apple's machine learning framework Turi Create, and Gensim 3.2.0 among today's top stories in machine learning, artificial intelligence, and data science news.

DeepMind's AlphaZero is now the most dominant chess-playing entity on the planet!

Google's self-learning AI AlphaZero teaches itself chess from scratch in four hours, and beats the previous champion

A few months after demonstrating its dominance over the game of Go, DeepMind's AlphaZero AI has trounced the world's top-ranked chess engine, and it did so without any prior knowledge of the game and after just four hours of self-training. In a one-on-one tournament against Stockfish 8, the reigning computer chess champion, the DeepMind-built system didn't lose a single game, winning or drawing all of the 100 matches played. AlphaZero is a modified version of AlphaGo Zero, the AI that recently won all 100 games of Go against its predecessor, AlphaGo. The system works nearly identically to AlphaGo Zero, but instead of playing Go, the machine is programmed to play chess and shogi.

Shape of things to come: IoT heralds 'smart' gymming!

Practix using IoT tracking devices in its gyms for real-time data analytics

Practix says it has developed an activity tracking system for gyms, using an IoT tracking device with real-time data analytics done by AI and machine learning algorithms. The Berlin-based group offers gym customers automatic, real-time logging of their workouts along with data-based analytics and metrics. Gym members receive a wristband which can connect to any current gear in the gym by scanning Practix's NFC patch. The wristband tracks the user's workout data and runs it through Practix's algorithms, and the output is presented through the Practix app and website, where both the gym operator and the customer can see detailed information about the workout.

Numba 0.36 released

Numba 0.36.1 announced with LLVM 5, the stencil decorator, and built with Anaconda Distribution 5 compilers

Numba 0.36.1 has been released with new features and some fixes to user-reported bugs (version 0.36.0 was never released). Numba has been upgraded to require llvmlite 0.21, which increases the required LLVM version to 5.0, resulting in minor improvements to code generation, especially for AVX-512; LLVM 5 also adds support for AMD Ryzen CPUs. A new compiler decorator has also been introduced in this release: @stencil. Similar to @vectorize, it lets you write a simple kernel function that is expanded into a larger array calculation. According to the developers, @stencil is for implementing "neighborhood" calculations, like local averages and convolutions. The kernel function accesses a view of the full array using relative indexing (i.e. a[-1] means one element to the left of the central element) and returns a scalar that goes into the output array. The ParallelAccelerator compiler passes can also multithread a stencil the same way they multithread an array expression. "The current @stencil implements only one option for boundary handling, but more can be added, and it does allow for asymmetric neighborhoods, which are important for trailing averages," the developers said.
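Based on that description, a minimal @stencil example might look like the sketch below (a three-point moving average; the kernel and input array are made up for illustration and assume Numba 0.36+ is installed):

```python
import numpy as np
from numba import njit, stencil

@stencil
def three_point_mean(a):
    # Relative indexing: a[-1] is the element to the left of the centre, a[1] the one to the right.
    return (a[-1] + a[0] + a[1]) / 3

@njit
def smooth(arr):
    # Stencil kernels can be called from (and parallelized inside) jitted functions.
    return three_point_mean(arr)

data = np.arange(10, dtype=np.float64)
print(smooth(data))  # boundary elements fall back to 0 under the default boundary handling
```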
Also, since Anaconda has started using custom builds of GCC 7.2 (on Linux) and clang 4.0 (on macOS) to build conda packages, in order to ensure the latest compiler performance and security features are enabled even on older Linux distributions like CentOS 6, the developers have migrated the build process for Numba conda packages on Mac and Linux over to these compilers for consistency with the rest of the distribution. "When doing AOT compilation in Numba, it uses the same compiler that was used for NumPy, so on Anaconda it will remind you to install the required compiler packages with conda," the Numba team said.

Apple's ambitious foray into ML and AI

Apple open sources 'Turi Create' machine learning framework on GitHub

After acquiring the machine learning startup Turi last year, Apple has now created a new machine learning framework called Turi Create and shared it on GitHub. According to Apple, Turi Create is designed to simplify the development of custom machine learning models; it is easy to use, has a visual focus, and is fast, scalable, and flexible. Turi Create is designed to export models to Core ML for use in iOS, macOS, watchOS, and tvOS apps. With Turi Create, developers can quickly build a feature that allows their app to recognize specific objects in images, and doing so takes just a few lines of code. Turi Create covers several common scenarios, including recommender systems, image classification, image similarity, object detection, activity classification, and text classification.

Gensim's "Christmas Come Early"

Gensim 3.2.0 released: new Poincaré embeddings, faster FastText, pretrained models for download, Linux/Windows/macOS wheels, and performance improvements

Gensim has announced the release of version 3.2.0 (codename "Christmas Come Early"). The new version comes with pre-trained models for download and implements Poincaré embeddings. FastText has been significantly optimized with a fast multithreaded implementation written natively in Python/Cython, and the release deprecates the existing wrapper for Facebook's C++ implementation. There are also binary pre-compiled wheels for Windows, macOS, and Linux, so users no longer need a C compiler to use the fast (Cythonized) versions of word2vec, doc2vec, fastText, and so on. Gensim 3.2.0 also adds DeprecationWarnings to deprecated methods and parameters, with a clear schedule for removal. There are other performance improvements and bug fixes, the details of which are available on GitHub.
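As a hedged sketch of the gensim 3.x-era API mentioned above (the corpus is a toy example, and note that the `size` argument was later renamed `vector_size` in gensim 4.0):

```python
from gensim.models import FastText, Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [
    ["quantum", "computing", "is", "coming"],
    ["machine", "learning", "loves", "gpus"],
    ["gensim", "trains", "word", "vectors"],
]

# Pure Python/Cython training, no external C compiler or wrapper needed.
w2v = Word2Vec(sentences, size=50, window=3, min_count=1, workers=2)
ft = FastText(sentences, size=50, window=3, min_count=1, workers=2)

print(w2v.wv["quantum"][:5])               # first few dimensions of a learned vector
print(ft.wv.most_similar("gpus", topn=2))  # subword-aware similarity from the FastText model
```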

Week at a Glance (2nd Dec – 8th Dec 2017): Top News from Data Science

Aarthi Kumaraswamy
09 Dec 2017
3 min read
This week, NIPS 2017, hardware improvements (especially GPUs becoming more AI friendly), and Bitcoin with the allied cryptocurrency and blockchain ecosystem take center stage.

NIPS 2017 Highlights - Part 1
3 great ways to leverage Structures for Machine Learning problems by Lise Getoor at NIPS 2017
20 lessons on bias in machine learning systems by Kate Crawford at NIPS 2017
Top Research papers showcased at NIPS 2017 – Part 2
Top Research papers showcased at NIPS 2017 – Part 1
Watch out for more in this area in the coming weeks.

News Highlights
Titan V is the "most powerful PC GPU ever created," Nvidia says
OpenAI announces Block sparse GPU kernels for accelerating neural networks
Amazon, Facebook, and Microsoft announce the general availability of ONNX 1.0
Qualcomm Snapdragon 845 processor to have state-of-the-art camera, AI, and VR features
PyTorch 0.3.0 releases, ending stochastic functions
DeepVariant: Using Artificial Intelligence for Human Genome Sequencing
AWS IoT Analytics: The easiest way to run analytics on IoT data, Amazon says
Introducing Amazon Neptune: A graph database service for your applications
Aurora Serverless: No servers, no instances to set up! You pay for only what you use
Amazon unveils SageMaker: An end-to-end machine learning service

In other news

8th Dec.' 17 – News Headlines
Coinbase becomes No. 1 iPhone app in US, crashes on demand
Overledger: Quant Network creates cross-blockchain data interoperability technology
Syte.ai unveils new API "Visual Search for All" for online fashion retailers
Honda teams up with China's SenseTime on AI tech for self-driving cars

7th Dec.' 17 – News Headlines
NVIDIA's CUTLASS to help develop new algorithms in CUDA C++ using high-performance GEMM constructs as building blocks
Google rolls out new machine learning features into its spreadsheet software to save time, get 'intuitive' answers
Status, the first-ever Mobile Ethereum OS, joins the Enterprise Ethereum Alliance

6th Dec.' 17 – News Headlines
Microsoft Azure is first global cloud provider to deploy AMD EPYC processors
Power System AC922: IBM takes deep learning to next level with first Power9-based systems
Study on AI's win in heads-up no-limit Texas hold'em poker wins Best Paper award at NIPS 2017
Google announces Apple's Core ML support in TensorFlow Lite
Google announces Cloud Video Intelligence and Cloud Natural Language Content Classification are now generally available

5th Dec.' 17 – News Headlines
Nvidia GPU Cloud to now support everyday desktops
IBM claims 10x faster machine learning with its new DuHL algorithm
New product from Falcon Computing lets software programmers design FPGA accelerators without any knowledge of FPGA
Google AutoML's "child" NASNet delivers advanced machine vision results

4th Dec.' 17 – News Headlines
Amazon announces new AWS Deep Learning AMI for Microsoft Windows
Mapbox acquires augmented reality activity tracking app Fitness AR
Home.me turns your 2D floor plan drawings into 3D renderings

8th Dec.' 17 - Headlines

Packt Editorial Staff
08 Dec 2017
6 min read
OpenAI's Block sparse GPU kernels, Nvidia's Titan V desktop GPU, Coinbase's surge on the bitcoin hike, and a new blockchain, Overledger, to link existing blockchains, among today's trending stories in artificial intelligence, machine learning, and data science news.

Nvidia doesn't support sparse matrix networks, so OpenAI created "block sparse GPU kernels"

AI research firm OpenAI launches software to speed up GPU-powered neural networks

OpenAI announced it has developed a library of tools that can help researchers build faster, more efficient neural networks that take up less memory on GPUs. Because Nvidia (the biggest manufacturer of GPUs for neural networks) doesn't support sparse matrix networks in its hardware, OpenAI has created what it calls "block sparse GPU kernels" to create these sparse networks on Nvidia's chips. OpenAI said it used its enhanced neural networks to perform sentiment analysis on user reviews on websites including Amazon and IMDB, and reported some impressive performance gains. "The sparse model outperforms the dense model on all sentiment datasets," OpenAI's researchers wrote in a blog post. "Our sparse model improves the state of the art on the document level IMDB dataset from 5.91 per cent error to 5.01 per cent. This is a promising improvement over our previous results which performed best only on shorter sentence level datasets." The kernels are written in Nvidia's CUDA programming language and are currently only compatible with the TensorFlow deep learning framework. They also only support Nvidia GPUs, but can be expanded to support other frameworks and hardware. OpenAI said it wants to "freely collaborate" with other institutions and researchers so that its block sparse GPU kernels can be used for other projects. The code is available on GitHub.

Nvidia's "most powerful PC GPU ever created"

Nvidia announces $2,999 PC GPU "Titan V": the Volta-powered GPU delivers 110 teraflops of deep learning horsepower, 9x its predecessor

Nvidia has just announced its new flagship graphics card, the Titan V, based on the architecture of its "Volta" GV100 graphics processor. It marks a new era: Titan V is Nvidia's first HBM2-equipped prosumer graphics card available for the masses. It comes with 12 GB of HBM2 memory across a 3,072-bit wide memory interface. The GPU has 5,120 shader processors, 640 tensor cores, and 320 texture units, with a base clock of 1,200 MHz and a 1,455 MHz boost clock. The 12 GB of memory runs at 1.7 Gbps (three 4 GB HBM2 stacks), and the card has 6-pin and 8-pin PCIe power connectors. Display outputs comprise three DisplayPort and one HDMI connector. The card keeps the regular vapor-chamber cooler with a copper heatsink and has 16 power phases with a 250 W TDP. Notably, there are no SLI fingers; instead, Nvidia appears to be using NVLink connections at the top of the PCB. Priced at a staggering $2,999, the GPU is available only through the Nvidia store. More information will follow soon, Nvidia said.

The insane Bitcoin bubble that is underway

Coinbase becomes No. 1 iPhone app in US, crashes on demand

With bitcoin surging at an unprecedented pace over the last few days, Coinbase has suddenly become the most downloaded app in the U.S. The popular bitcoin wallet ranked around 400th on the App Store's free chart less than a month ago, but has spiked to the top slot, beating out the likes of YouTube, Facebook Messenger, and Instagram.
Coinbase's rise to the top of the App Store is attributed to the ongoing crazy ride of bitcoin, as the cryptocurrency has skyrocketed from just under $10,000 at the start of the week to over $18,000 in no time. So much so that Coinbase is now unable to handle the demand: for large portions of the day its service was unavailable, and the app was hanging quite often. Coinbase later tweeted that its site was "down for maintenance" as it was experiencing record high traffic.

"The Blockchain to Fix All Blockchains"

Overledger: Quant Network creates cross-blockchain data interoperability technology

London-based Quant Network has launched Overledger, a technology for data interoperability across different blockchains. The idea is to do something similar to TCP/IP, which enabled the internet. "The uniqueness of our operating system is that Overledger is not another blockchain," Quant Network Chief Strategist Paolo Tasca said. "We do not impose new consensus mechanisms, new gateways, adapters or special validating nodes on top of existing blockchains. Overledger is a virtual blockchain that links existing blockchains and allows developers to build multi-chain applications (or in other terms blockchain-agnostic applications)." Gilbert Verdian, CEO and co-founder of Quant Network, confirmed that a patent for the Overledger technology was filed in the first week of December. According to Verdian, Quant Network is focusing on three goals: developing an API to connect the world's networks to multiple blockchains; bridging existing networks (e.g. financial services) to new blockchains; and developing a new blockchain operating system with a protocol and a platform to create next-generation, multi-chain applications.

Machine learning in fashion searches

Syte.ai unveils new API "Visual Search for All" for online fashion retailers

Syte.ai, a visual search startup just for fashion, has launched a new API that makes adding visual search accessible to more e-commerce sites. Called Visual Search for All, the white-label feature can be integrated into retail websites or apps within 24 hours and lets shoppers upload photos saved on their phones, like screenshots from Instagram, to find similar products on sale. It is based on the same technology as Syte.ai's search tools for large fashion brands and publishers, which show shoppers relevant items when they hover a cursor over part of an image. "Once it indexes a brand's product feed, Visual Search for All can be added to a site's search bar in less than a day by adding a line of HTML," co-founder Lihi Pinto Fryman said, noting that clients pay a monthly license fee based on the number of image matches likely to be used. Facebook Messenger and Line users can try out Syte.ai's technology by sending images to its chatbot, Syte Inspire. The Israeli fashion tech startup raised $8 million earlier this year from investors including top Asian tech firms NHN, Line Corp., and Naver.

Honda's self-driving cars project

Honda teams up with China's SenseTime on AI tech for self-driving cars

Honda has signed a five-year joint research and development agreement with China's SenseTime, an IT firm specializing in artificial intelligence, for self-driving car technology. As part of its 2030 Vision strategy announced in June, Honda aims to have a car with Level 4 self-driving capability on sale by 2025. According to Honda, SenseTime excels in image recognition technologies, especially recognition of moving objects, powered by deep learning technology.
In their new partnership, Honda will join its AI algorithms for environment understanding, risk prediction and action planning with SenseTime’s moving object recognition technologies. The goal is to develop a reliable self-driving system that will be able to handle both highways and complex urban environments, the automaker said.

Titan V is the “most powerful PC GPU ever created,” Nvidia says

Abhishek Jha
08 Dec 2017
4 min read
Nvidia has developed a certain knack of late for releasing only the "most powerful" products. Never mind the verbiage; they are indeed the market leaders in chip making, after all. Their latest announcement, Titan V, has nine times the deep learning horsepower of its predecessor, the $1,200 Titan Xp. When Nvidia CEO Jensen Huang lit up a gathering of hundreds of elite deep learning researchers at the Neural Information Processing Systems AI conference (better known as NIPS) by unveiling Titan V, it marked quite a new era. The new flagship processor is Nvidia's first HBM2-equipped prosumer graphics card available for the masses, and the chip giant claims Titan V could even transform a PC into an AI supercomputer!

Watch: https://www.youtube.com/watch?v=NPrfiOldKf8&feature=youtu.be

Huang said Titan V was tailor-made for "breakthrough discoveries" across high performance computing (HPC) and artificial intelligence, and that it excels at computational processing for scientific simulation. Its 21.1 billion transistors deliver 110 teraflops of raw horsepower, 9x that of its predecessor, with extreme energy efficiency. "With TITAN V, we are putting Volta into the hands of researchers and scientists all over the world. We broke new ground with its new processor architecture, instructions, numerical formats, memory architecture and processor links," he said.

Design details

Titan V is based on the architecture of Nvidia's "Volta" GV100 graphics processor. It comes with 12 GB of HBM2 memory across a 3,072-bit wide memory interface. The GPU features 5,120 CUDA cores plus an additional 640 tensor cores that have been optimized to speed up machine learning workloads. Titan V's Volta architecture features a major redesign of the streaming multiprocessor at the center of the GPU; it doubles the energy efficiency of the previous-generation Pascal design, enabling dramatic boosts in performance in the same power envelope. New tensor cores designed specifically for deep learning deliver up to 9x higher peak teraflops. With independent parallel integer and floating-point data paths, Volta is also much more efficient on workloads that mix computation and addressing calculations. Its new combined L1 data cache and shared memory unit significantly improves performance while also simplifying programming. Fabricated on a new TSMC 12-nanometer FFN high-performance manufacturing process customized for Nvidia, Titan V also incorporates Volta's highly tuned 12 GB HBM2 memory subsystem for advanced memory bandwidth utilization.
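As a quick sanity check on the memory-bandwidth figure in the spec table below, here is a tiny Python calculation. It assumes the listed 1,700 MHz memory clock corresponds to an effective 1.7 Gbps per pin, which is how the rate is quoted elsewhere on this page; treat it as a back-of-the-envelope estimate, not an official Nvidia figure.

```python
# Rough check of Titan V's quoted HBM2 memory bandwidth.
bus_width_bits = 3072   # memory interface width from the spec table
data_rate_gbps = 1.7    # assumed effective per-pin data rate (1,700 MHz listed)

bandwidth_gb_per_s = bus_width_bits * data_rate_gbps / 8  # bits -> bytes
print(f"~{bandwidth_gb_per_s:.1f} GB/s")  # ~652.8 GB/s, close to the 653 GB/s listed
```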
Reference            | Titan V             | Titan Xp      | Titan X       | GTX 1080      | GTX 1070      | GTX 1060
Die size             | 815 mm²             | 471 mm²       | 471 mm²       | n/a           | n/a           | n/a
GPU                  | GV100               | GP102-400-A1  | GP102-400-A1  | GP104-400-A1  | GP104-200-A1  | GP106-400-A1
Architecture         | Volta               | Pascal        | Pascal        | Pascal        | Pascal        | Pascal
Transistor count     | 21 billion          | 12 billion    | 12 billion    | 7.2 billion   | 7.2 billion   | 4.4 billion
Fabrication node     | TSMC 12 nm FinFET+  | TSMC 16 nm    | TSMC 16 nm    | TSMC 16 nm    | TSMC 16 nm    | TSMC 16 nm
CUDA cores           | 5,120               | 3,840         | 3,584         | 2,560         | 1,920         | 1,280
SMMs / SMXs          | 40                  | 30            | 28            | 20            | 15            | 10
ROPs                 | n/a                 | 96            | 96            | 64            | 64            | 48
GPU core clock       | 1,200 MHz           | 1,405 MHz     | 1,417 MHz     | 1,607 MHz     | 1,506 MHz     | 1,506 MHz
GPU boost clock      | 1,455 MHz           | 1,582 MHz     | 1,531 MHz     | 1,733 MHz     | 1,683 MHz     | 1,709 MHz
Memory clock         | 1,700 MHz           | 2,852 MHz     | 2,500 MHz     | 1,250 MHz     | 2,000 MHz     | 2,000 MHz
Memory size          | 12 GB               | 12 GB         | 12 GB         | 8 GB          | 8 GB          | 3 GB / 6 GB
Memory bus           | 3,072-bit           | 384-bit       | 384-bit       | 256-bit       | 256-bit       | 192-bit
Memory bandwidth     | 653 GB/s            | 547 GB/s      | 480 GB/s      | 320 GB/s      | 256 GB/s      | 192 GB/s
FP performance       | 15 TFLOPS           | 12.0 TFLOPS   | 11.0 TFLOPS   | 9.0 TFLOPS    | 6.45 TFLOPS   | 4.61 TFLOPS
GPU thermal threshold| 91 °C               | 97 °C         | 94 °C         | 94 °C         | 94 °C         | 94 °C
TDP                  | 250 W               | 250 W         | 250 W         | 180 W         | 150 W         | 120 W
Launch MSRP (ref)    | $2,999              | $1,200        | $1,200        | $599 / $699   | $379 / $449   | $249 / $299

Source: guru3D

Availability

Titan V is ideal for developers who want to use their PCs to do work in AI, deep learning, and high performance computing. Users of Titan V can gain immediate access to the latest GPU-optimized AI, deep learning, and HPC software by signing up at no charge for an NVIDIA GPU Cloud account. Priced at a staggering $2,999, Titan V is available to purchase only from the Nvidia stores in participating countries.

One final thing about Titan V: its gold and black color scheme looks pretty cool!

OpenAI announces Block sparse GPU kernels for accelerating neural networks

Savia Lobo
08 Dec 2017
3 min read
OpenAI, an artificial intelligence research firm, has brought a wave of faster GPU computation with its new block sparse GPU kernels: software programs optimized to build sparse networks on Nvidia's hardware. These help in building faster yet efficient neural networks, without eating up much memory on your GPUs.

Neural networks are built using layers of connected nodes, but their processing power is restricted by the architecture of the GPUs they run on; in particular, GPUs have lacked an efficient implementation of sparse linear operations. Researchers at OpenAI say that it is now possible to make neural networks far more efficient by bringing sparse matrices into their design.

How sparse matrices help GPUs

A sparse matrix is simply a mathematical matrix in which most entries are zero. Such zero-valued elements can be compressed and skipped during matrix multiplications, which saves computation time and also takes up less memory on GPUs (source: https://blog.openai.com/block-sparse-gpu-kernels/). The saved computational power can then be used to train deep neural networks more efficiently; networks can perform inference and run algorithms simultaneously, up to 10 times faster than with regular dense matrices. The problem OpenAI faces with sparse matrices is that Nvidia, the biggest name in the manufacture of GPUs for neural networks, does not support sparse matrix models in its hardware. Enter block sparse GPU kernels.

Block sparse GPU kernels: sparse matrices get an upgrade

To work around the lack of hardware support for sparsity on Nvidia chips, a team of researchers at OpenAI developed block sparse GPU kernels. Key points to note about them:

They are written in Nvidia's CUDA programming language.
At present, they are only compatible with TensorFlow.
They only support Nvidia's GPUs.

OpenAI also declared that it is sharing its block sparse GPU kernels with the wider research community so they can be put to use in other developments, and that the kernels will be expanded to support other hardware and frameworks.

OpenAI used neural networks enhanced with the block sparse GPU kernels to carry out sentiment analysis on reviews from IMDB and Amazon. The result: the sparse models beat the dense models on all sentiment datasets (source: https://s3-us-west-2.amazonaws.com/openai-assets/blocksparse/blocksparsepaper.pdf). OpenAI also mentioned that its sparse model improved the state of the art on the IMDB dataset from 5.91% error to 5.01%, a promising improvement over its previous results, which performed best only on shorter sentence-level datasets.

Promising as the new kernels are, the OpenAI research team does not yet have a definitive view on when and where they will help most, and the community promises to explore this space further. To learn how to install and develop block sparse GPU kernels, click on the GitHub link here.
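To illustrate the basic memory-and-compute argument for sparsity in a framework-neutral way (this is plain SciPy on the CPU, not OpenAI's CUDA kernels, and the matrix sizes are arbitrary):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# A 4,000 x 4,000 matrix in which roughly 99% of the entries are zero.
dense = rng.random((4000, 4000))
dense[dense < 0.99] = 0.0

compressed = sparse.csr_matrix(dense)  # store only the non-zero values and their positions
x = rng.random(4000)

print("dense storage: ", dense.nbytes, "bytes")
print("sparse storage:", compressed.data.nbytes + compressed.indices.nbytes + compressed.indptr.nbytes, "bytes")

# Both products give the same answer, but the sparse multiply skips all the zeros.
np.testing.assert_allclose(dense @ x, compressed @ x)
```

OpenAI's kernels apply the same principle at the level of fixed-size blocks on the GPU, which is what makes the sparsity pattern efficient to exploit on Nvidia hardware.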

Amazon, Facebook and Microsoft announce the general availability of ONNX 1.0

Savia Lobo
08 Dec 2017
2 min read
Amazon, Facebook, and Microsoft have recently rolled out an exciting announcement for developers: the ONNX 1.0 format is now production ready!

The Open Neural Network Exchange (ONNX) format allows interoperability between various deep learning frameworks such as Caffe2, Apache MXNet, Microsoft Cognitive Toolkit (CNTK), and PyTorch. With this interoperability, version 1.0 allows users to get their deep learning models into production at a much faster pace. One can train a model in one framework (PyTorch, for instance) and carry out inference in another (Microsoft CNTK or Apache MXNet). Since the initial release of ONNX in September, many communities have got involved and adopted ONNX within their organizations, with Amazon, Facebook, and Microsoft being the major ones. Hardware-focused organizations such as Qualcomm, Huawei, and Intel have announced ONNX support for their hardware platforms, which gives users the freedom to run their models on different hardware. Without a common format, optimizations have to be integrated separately into each framework; ONNX makes it easier for such optimizations to reach more developers.

Tools for ONNX 1.0

Netron: Netron is a viewer for ONNX neural network models. It runs on macOS, Windows, and Linux and can serve models via a Python web server. For a more detailed overview of Netron, visit the GitHub link here.

Net Drawer: The Net Drawer tool is used to visualize ONNX models. It takes a serialized ONNX model as input and produces a directed graph representation. The output graph contains information on input/output tensors, tensor names, operator types and numbers, and so on. To know more about how the Net Drawer tool works, visit the GitHub link here.

At present, ONNX models are supported in frameworks such as MXNet, Microsoft Cognitive Toolkit, PyTorch, and Caffe2, and there are connectors for other common frameworks and libraries as well. The current version of ONNX is designed with computer vision applications in mind; the Amazon, Facebook, and Microsoft teams, along with the ONNX community and its partners, are working in unison to expand beyond vision applications in future versions of ONNX. To know more about ONNX 1.0 in detail, please visit GitHub or the ONNX website.
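As a hedged sketch of the export path described above (assuming the `torch` and `onnx` packages are installed; the toy model, file name, and shapes are made up for illustration):

```python
import torch
import torch.nn as nn
import onnx

# A toy network standing in for whatever model you actually trained in PyTorch.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

dummy_input = torch.randn(1, 4)  # example input that fixes the exported graph's shapes
torch.onnx.export(model, dummy_input, "toy_model.onnx")

# Load the exported file back and validate it against the ONNX spec before handing it
# to another runtime (CNTK, MXNet, Caffe2, ...).
loaded = onnx.load("toy_model.onnx")
onnx.checker.check_model(loaded)
print("exported opset version:", loaded.opset_import[0].version)
```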

7th Dec.' 17 - Headlines

Packt Editorial Staff
07 Dec 2017
6 min read
NVIDIA's CUTLASS, ONNX 1.0, Qualcomm's Snapdragon 845 chip, Ethereum interface Status, and new machine learning features in Google Sheets in today's top stories around machine learning, AI and data science news.

Announcing CUTLASS for fast linear algebra in CUDA C++

NVIDIA's CUTLASS to help develop new algorithms in CUDA C++ using high-performance GEMM constructs as building blocks

NVIDIA has released CUTLASS: CUDA Templates for Linear Algebra Subroutines. It is a collection of CUDA C++ templates and abstractions for implementing high-performance GEMM computations at all levels and scales within CUDA kernels. Unlike other templated GPU libraries for dense linear algebra, CUTLASS decomposes the "moving parts" of GEMM into fundamental components abstracted by C++ template classes, allowing programmers to easily customize and specialize them within their own CUDA kernels. "Our CUTLASS primitives include extensive support for mixed-precision computations, providing specialized data-movement and multiply-accumulate abstractions for handling 8-bit integer, half-precision floating point (FP16), single-precision floating point (FP32), and double-precision floating point (FP64) types," NVIDIA said in its statement. "One of the most exciting features of CUTLASS is an implementation of matrix multiplication that runs on the new Tensor Cores in the Volta architecture using the WMMA API. Tesla V100's Tensor Cores are programmable matrix-multiply-and-accumulate units that can deliver up to 125 Tensor TFLOP/s with high efficiency." NVIDIA is releasing the CUTLASS source code on GitHub as an initial exposition of CUDA GEMM techniques that will evolve into a template library API.

ONNX is production ready

Announcing ONNX 1.0

Open Neural Network Exchange (ONNX), a joint initiative from Facebook and Microsoft later joined by Amazon Web Services, has reached the production milestone of version 1.0. ONNX 1.0 enables users to move deep learning models between frameworks, making it easier to put them into production. For example, developers can build sophisticated computer vision models using frameworks such as PyTorch and run them for inference using Microsoft Cognitive Toolkit or Apache MXNet. Since the initial release of ONNX in September, numerous hardware partners including Qualcomm, Huawei, and Intel have announced support for the ONNX format on their hardware platforms, making it easier for users to run models on different hardware.

Qualcomm's new flagship chip Snapdragon 845

Qualcomm's next-generation processor Snapdragon 845 focuses on AI, augmented and virtual reality

At their annual Snapdragon Technology Summit, Qualcomm announced updates on their latest premium processor due out next year: Snapdragon 845. Though the processor will be built on the same 10 nm process technology as the 835 (the previous generation), Qualcomm has made architectural changes to embrace next-generation AR and VR applications. In addition to a stronger focus on imaging and AI, it promises robust battery life. Snapdragon 845 will still support Gigabit LTE speeds via the X20 modem, and will feature four Cortex-A75 and four Cortex-A53 cores as its processing module. At the Summit, Xiaomi also made an appearance to announce that its upcoming flagship phone will come equipped with the Snapdragon 845 processor, and the chip will be found in non-Android devices as well, including Windows 10 PCs.

The Spectra 280 ISP and Adreno 630 are new additions designed to improve photography and video capture, alongside SLAM (simultaneous localization and mapping) for obstacle collision. Qualcomm also said that the new chip delivers a 3x performance boost in AI, and has added support for TensorFlow Lite and ONNX, apart from the regular old TensorFlow and Caffe.

Google spreadsheets getting 'smarter'

Google rolls out new machine learning features in its spreadsheet software to save time and get 'intuitive' answers

Google said it is enhancing the "Explore" feature in Sheets with new capabilities, including formula suggestions and pivot tables powered by machine learning, to deliver faster and more useful insights. Sheets is part of Google's productivity suite, meant to rival Microsoft Corp.'s popular Excel spreadsheet software. At present, users typically type quick formulas such as =SUM or =AVERAGE into Sheets to get answers about their data. Google now wants to introduce machine intelligence into the process, so that when users begin typing a formula, Sheets will pop up a few suggestions for full formulas based on the context of the data in that spreadsheet. Creating pivot tables has always been tricky and time-consuming, so Sheets can now 'intelligently' suggest pivot tables to find the answers, and users can ask questions in everyday language. For example, they can type "what is the sum of revenue by salesperson?" and Sheets will suggest the best pivot table to answer that question. Additional new features in Sheets include a new user interface for pivot tables, customizable headings for rows and columns, and new ways to view data. "Now, when you create a pivot table, you can 'show values as a % of totals' to see summarized values as a fraction of grand totals," Google said. "Once you have a table, you can right-click on a cell to 'view details' or even combine pivot table groups to aggregate data the way you need it." Google also added the ability to create "waterfall charts," which provide a way to visualize sequential changes in data, and users can now quickly import or paste fixed-width formatted data files into Sheets. The new updates will roll out over the coming weeks.

A new entrant to the Enterprise Ethereum Alliance

Status, the first-ever mobile Ethereum OS, joins the Enterprise Ethereum Alliance

Status, the Ethereum blockchain-based decentralized browser with built-in chat and wallet functionality, has joined the Enterprise Ethereum Alliance (EEA), the world's largest open-source blockchain initiative. With membership across the Fortune 500, enterprises, startups, research facilities and even governments, the EEA's mission is to enhance the privacy, security, and scalability of Ethereum-based blockchain technologies. Status recently closed a $100 million funding round through the sale of its SNT tokens. Currently in development, it is an open source mobile platform that serves as a gateway to decentralized apps (DApps) and services built on Ethereum. The base offering enables access to encrypted messages, smart contracts, digital currency and more.

Qualcomm Snapdragon 845 processor to have state-of-the-art camera, AI, and VR features

Abhishek Jha
07 Dec 2017
5 min read
Qualcomm's next flagship processor, Snapdragon 845, is likely to be found in high-end Android phones in early 2018, the company said at its Snapdragon Tech Summit. The 845 is a direct successor to last year's hugely popular Snapdragon 835, and brings amplified performance, better connectivity, and more efficient power usage.

As anticipated, Snapdragon 845 will be built on the same 10 nm process technology as the 835, but there are changes in architecture that significantly boost camera videography and virtual and augmented reality experiences on a mobile platform. That means the next batch of flagship phones using the processor will be very capable handsets, and the first is most likely the Samsung Galaxy S9 (the 'coincidence' ahead of its January launch is interesting).

Like its predecessors, the 845 is not just a processor but what we call an 'alphabet soup' of processors and systems, such as a CPU, GPU, ISP, and DSP, all contained in a single unit. Qualcomm has paired Snapdragon 845 with its latest X20 LTE modem, which provides gigabit connectivity on supported networks.

The big new feature here is the ability to capture 4K video in HDR at up to 60 frames per second. New architectures, the Spectra 280 ISP and the Adreno 630, have been designed to improve photography and video capture. The 845 can also shoot slow-motion 720p video at 480 frames per second and supports 1080p video recording at up to 240 frames per second. Also new is SLAM (simultaneous localization and mapping), which can be used for obstacle collision and in augmented and virtual reality.

And if that's not all, Snapdragon 845 is configured to run artificial intelligence algorithms and machine learning on the chipset itself, rather than relying on cloud-based services. The chipset uses both its CPU and GPU to do this, but also has the Hexagon 685 digital signal processor (DSP), which acts as an AI co-processor and gives the Snapdragon 845 a 3x performance boost in AI tasks over the 835.

There are a lot of other improvements. Qualcomm said its new Adreno 630 GPU provides 30% better graphics, 30% better power efficiency, and 2.5x faster display throughput. It also supports room-scale 6DoF and can map rooms in real time for VR and AR applications.

Snapdragon 845 at a glance:
Process: 10 nm FinFET
CPU: 8x Kryo 385 (4x Cortex-A75 up to 2.8 GHz + 4x Cortex-A55 up to 1.8 GHz)
GPU / VPS: Adreno 630
Camera: up to 32MP single / 16MP + 16MP dual
Video recording: 4K HDR
Max screen resolution: 2x 2400x2400 @ 120 FPS (VR)
LTE: 1.2 Gbps down / 150 Mbps up
Wi-Fi: 802.11ad multi-gigabit
AI platform: Hexagon 685
Quick Charge: Quick Charge 4/4+

The eight CPU cores are Kryo 385, which, thanks to a redesign of the platform, will provide a "25% performance uplift", as Qualcomm claims. Four of them are dedicated to performance and are based on the Cortex-A75 architecture, while the other four are based on the Cortex-A55 and run at up to 1.8 GHz. With the new Adreno 630 GPU, Qualcomm promises "30% faster graphics and 30% better power efficiency". The new unit should bring 2.5x faster display throughput, meaning a 2K x 2K display can efficiently run at 120 Hz. With eye tracking, hand tracking, and multiview rendering features, Qualcomm goes so far as to call the Adreno 630 a "Visual Processing Subsystem".

Even better, Qualcomm has freed up developer choice. The San Diego chipmaker has added support for TensorFlow Lite and the new Open Neural Network Exchange (ONNX) frameworks, in addition to regular old TensorFlow and Caffe.

As far as security is concerned, there is a dedicated secure processing unit on board, which should bring improvements to biometrics and encryption. Battery life is another key upgrade, with the Snapdragon 845 promising almost one-third power reduction for energy-hungry features like video capture, AR/VR and gaming. Interestingly, Xiaomi also made an appearance at the summit to announce that its upcoming flagship phone will come equipped with the Snapdragon 845 chip. And let us not forget that Microsoft has already started using Qualcomm 835 chips with Windows 10; further to that announcement, Snapdragon 845 will be making its way to Windows 10 PCs by the end of 2018.

Here is a quick view of the features Snapdragon 845 includes:

Qualcomm Spectra 280 ISP
- Ultra HD premium capture
- Qualcomm Spectra Module Program, featuring Active Depth Sensing
- MCTF video capture
- Multi-frame noise reduction
- High performance capture up to 16MP @ 60FPS
- Slow motion video capture (720p @ 480 fps)
- ImMotion computational photography

Adreno 630 Visual Processing Subsystem
- 30% improved graphics/video rendering and power reduction compared to previous generation
- Room-scale 6DoF with SLAM
- Adreno foveation, featuring tile rendering, eye tracking, multiview rendering, fine grain preemption
- 2K x 2K @ 120Hz, for 2.5x faster display throughput
- Improved 6DoF with hand-tracking and controller support

Qualcomm Hexagon 685 DSP
- 3rd generation Hexagon Vector DSP (HVX) for AI and imaging
- 3rd generation Qualcomm All-Ways Aware Sensor Hub
- Hexagon scalar DSP for audio

Snapdragon X20 LTE modem
- Support for 1.2 Gbps Gigabit LTE Category 18
- License Assisted Access (LAA)
- Citizens Broadband Radio Service (CBRS) shared radio spectrum
- Dual SIM-Dual VoLTE (DSDV)

Connectivity
- Multigigabit 11ad Wi-Fi with diversity module
- Integrated 2x2 11ac Wi-Fi with Dual Band Simultaneous (DBS) support
- 11k/r/v: carrier Wi-Fi enhanced mobility, fast acquisition and congestion mitigation
- Bluetooth 5 with proprietary enhancements for ultra-low power wireless ear bud support and direct audio broadcast to multiple devices

Secure Processing Unit
- Biometric authentication (fingerprint, iris, voice, face)
- User and app data protection
- Integrated use-cases such as integrated SIM, payments, and more

Qualcomm Aqstic Audio
- Qualcomm Aqstic audio codec (WCD934x)
- Playback: dynamic range 130dB, THD+N -109dB
- Native DSD support (DSD64/DSD128), PCM up to 384kHz/32bit
- Low power voice activation: 0.65mA
- Record: dynamic range 109dB, THD+N -103dB
- Sampling: up to 192kHz/24bit

Qualcomm Quick Charge 4+

Kryo 385 CPU
- Four performance cores up to 2.8GHz (25 percent performance uplift compared to previous generation)
- Four efficiency cores up to 1.8GHz
- 2MB shared L3 cache (new)
- 3MB system cache (new)

For more information on Snapdragon 845, please visit Qualcomm's site.

PyTorch 0.3.0 released, ending stochastic functions

Abhishek Jha
06 Dec 2017
13 min read
PyTorch 0.3.0 has removed stochastic functions, i.e. Variable.reinforce(), citing “limited functionality and broad performance implications.” The Python package has added a number of performance improvements, new layers, support for ONNX, CUDA 9, and cuDNN 7, and “lots of bug fixes” in the new version.

“The motivation for stochastic functions was to avoid book-keeping of sampled values. In practice, users were still book-keeping in their code for various reasons. We constructed an alternative, equally effective API, but did not have a reasonable deprecation path to the new API. Hence this removal is a breaking change,” the PyTorch team said. To replace stochastic functions, they have introduced the torch.distributions package.

So if your previous code looked like this:

probs = policy_network(state)
action = probs.multinomial()
next_state, reward = env.step(action)
action.reinforce(reward)
action.backward()

This could be the new equivalent code:

probs = policy_network(state)
# NOTE: categorical is equivalent to what used to be called multinomial
m = torch.distributions.Categorical(probs)
action = m.sample()
next_state, reward = env.step(action)
loss = -m.log_prob(action) * reward
loss.backward()

What is new in PyTorch 0.3.0?

Unreduced losses

Some loss functions can now compute per-sample losses in a mini-batch. By default, PyTorch sums losses over the mini-batch and returns a single scalar loss, which was limiting for users. Now a subset of loss functions allow specifying reduce=False to return individual losses for each sample in the mini-batch (a short worked example follows the profiler section below). Example:

loss = nn.CrossEntropyLoss(..., reduce=False)

Currently supported losses: MSELoss, NLLLoss, NLLLoss2d, KLDivLoss, CrossEntropyLoss, SmoothL1Loss, L1Loss. More loss functions will be covered in the next release.

An in-built profiler in the autograd engine

PyTorch has built a low-level profiler to help you identify bottlenecks in your models. Let us start with an example:

>>> x = Variable(torch.randn(1, 1), requires_grad=True)
>>> with torch.autograd.profiler.profile() as prof:
...     y = x ** 2
...     y.backward()
>>> # NOTE: some columns were removed for brevity
... print(prof)
-----------------------------------  ----------  ---------
Name                                 CPU time    CUDA time
-----------------------------------  ----------  ---------
PowConstant                          142.036us   0.000us
N5torch8autograd9GraphRootE          63.524us    0.000us
PowConstantBackward                  184.228us   0.000us
MulConstant                          50.288us    0.000us
PowConstant                          28.439us    0.000us
Mul                                  20.154us    0.000us
N5torch8autograd14AccumulateGradE    13.790us    0.000us
N5torch8autograd5CloneE              4.088us     0.000us

The profiler works for both CPU and CUDA models. For CUDA models, you have to run your Python program with a special nvprof prefix. For example:

nvprof --profile-from-start off -o trace_name.prof -- python <your arguments>

# in python
>>> with torch.cuda.profiler.profile():
...     model(x)  # Warmup CUDA memory allocator and profiler
...     with torch.autograd.profiler.emit_nvtx():
...         model(x)

Then, you can load trace_name.prof in PyTorch and print a summary profile report.

>>> prof = torch.autograd.profiler.load_nvprof('trace_name.prof')
>>> print(prof)

For additional documentation, you can visit this link.
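To make the reduce=False change above concrete, here is a minimal sketch of computing per-sample losses. The batch size, class count, and values are arbitrary assumptions for the example, and it follows the 0.3-era Variable API used elsewhere in these notes.

import torch
import torch.nn as nn
from torch.autograd import Variable

criterion = nn.CrossEntropyLoss(reduce=False)   # keep one loss value per sample
logits = Variable(torch.randn(4, 10))           # a mini-batch of 4 samples, 10 classes
targets = Variable(torch.LongTensor([1, 0, 3, 9]))

per_sample = criterion(logits, targets)   # shape (4,): one loss per sample, not a scalar
print(per_sample)
print(per_sample.mean())                  # reduce manually if and when you need to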
Higher order gradients

v0.3.0 has added higher-order gradients support for the following layers:

- ConvTranspose, AvgPool1d, AvgPool2d, LPPool2d, AvgPool3d, MaxPool1d, MaxPool2d, AdaptiveMaxPool, AdaptiveAvgPool, FractionalMaxPool2d, MaxUnpool1d, MaxUnpool2d, nn.Upsample, ReplicationPad2d, ReplicationPad3d, ReflectionPad2d
- PReLU, HardTanh, L1Loss, SoftSign, ELU, RReLU, Hardshrink, Softplus, SoftShrink, LogSigmoid, Softmin, GLU
- MSELoss, SmoothL1Loss, KLDivLoss, HingeEmbeddingLoss, SoftMarginLoss, MarginRankingLoss, CrossEntropyLoss
- DataParallel

Optimizers

- optim.SparseAdam implements a lazy version of the Adam algorithm suitable for sparse tensors. In this variant, only moments that show up in the gradient get updated, and only those portions of the gradient get applied to the parameters.
- Optimizers now have an add_param_group function that lets you add new parameter groups to an already constructed optimizer (a short usage sketch appears after the Framework interoperability section below).

New layers and nn functionality

- Added AdaptiveMaxPool3d and AdaptiveAvgPool3d
- Added LPPool1d
- F.pad now supports 'reflection' and 'replication' padding on 1d, 2d, and 3d signals (so 3D, 4D and 5D Tensors), and constant padding on n-d signals
- nn.Upsample now works for 1D signals (i.e. B x C x L Tensors) in nearest and linear modes
- Users can leave certain input dimensions unspecified for AdaptivePool*d and have them inferred at runtime. For example:

# target output size of 10x7
m = nn.AdaptiveMaxPool2d((None, 7))

- The DataParallel container on CPU is now a no-op (instead of erroring out)

New Tensor functions and features

- Introduced torch.erf and torch.erfinv, which compute the error function and the inverse error function of each element in the Tensor
- Added broadcasting support to bitwise operators
- Added Tensor.put_ and torch.take, similar to numpy.put and numpy.take. The take function lets you linearly index into a tensor without viewing it as a 1D tensor first; the output has the same shape as the indices. The put function copies values into a tensor, also using linear indices.
- Added zeros and zeros_like for sparse Tensors
- 1-element Tensors can now be cast to Python scalars. For example, int(torch.Tensor([5])) works now.

Other additions

- Added torch.cuda.get_device_name and torch.cuda.get_device_capability, which do what the names say. Example:

>>> torch.cuda.get_device_name(0)
'Quadro GP100'
>>> torch.cuda.get_device_capability(0)
(6, 0)

- If you set torch.backends.cudnn.deterministic = True, CuDNN convolutions use deterministic algorithms
- torch.cuda.get_rng_state_all and torch.cuda.set_rng_state_all are introduced to let you save / load the state of the random number generator over all GPUs at once
- torch.cuda.empty_cache() frees the cached memory blocks in PyTorch's caching allocator. This is useful for long-running ipython notebooks that share the GPU with other processes.

API changes

- softmax and log_softmax now take a dim argument that specifies the dimension in which slices are taken for the softmax operation. dim allows negative dimensions as well (dim = -1 is the last dimension).
- torch.potrf (Cholesky decomposition) is now differentiable and defined on Variable
- All instances of device_id have been replaced with device, to make things consistent
- torch.autograd.grad now allows you to specify inputs that are unused in the autograd graph if you use allow_unused=True. This is useful when using torch.autograd.grad in large graphs with lists of inputs / outputs. For example:

x, y = Variable(...), Variable(...)
torch.autograd.grad(x * 2, [x, y])                     # errors
torch.autograd.grad(x * 2, [x, y], allow_unused=True)  # works

- pad_packed_sequence now allows a padding_value argument that can be used instead of zero-padding
- Dataset now has a + operator (which uses ConcatDataset). You can do something like MNIST(...) + FashionMNIST(...), and you will get a concatenated dataset containing samples from both.
- torch.distributed.recv allows Tensors to be received from any sender (hence, src is optional). recv returns the rank of the sender.
- Added zero_() to Variable
- Variable.shape returns the size of the Tensor (now made consistent with Tensor)
- torch.version.cuda specifies the CUDA version that PyTorch was compiled with
- Added a missing random_ function for CUDA
- torch.load and torch.save can now take a pathlib.Path object, which is a standard Python 3 typed filepath object
- If you want to load a model's state_dict into another model (for example to fine-tune a pre-trained network), load_state_dict used to be strict about matching the parameter key names. PyTorch now provides a strict=False option to load_state_dict that only loads in parameters whose keys match and ignores the other parameter keys.
- Added nn.functional.embedding_bag, which is equivalent to nn.EmbeddingBag

Performance improvements

- The overhead of torch functions on Variables was around 10 microseconds. This has been brought down to about 1.5 microseconds by moving most of the core autograd formulas into C++ using the ATen library.
- softmax and log_softmax are now 4x to 256x faster on the GPU after rewriting the GPU kernels
- 2.5x to 3x performance improvement of the distributed AllReduce (gloo backend) by enabling GPUDirect
- nn.Embedding's renorm option is much faster on the GPU. For embedding dimensions of 100k x 128 and a batch size of 1024, it is 33x faster.
- All pointwise ops now use OpenMP and get multi-core CPU benefits
- Added a single-argument version of torch.arange, for example torch.arange(10)

Framework interoperability

DLPack interoperability: DLPack Tensors are cross-framework Tensor formats. There are now torch.utils.to_dlpack(x) and torch.utils.from_dlpack(x) to convert between DLPack and torch Tensor formats. The conversion has zero memory copy and hence is very efficient.

Model exporter to ONNX: ONNX is a common model interchange format that can currently be executed in Caffe2, CoreML, CNTK, MXNet, and TensorFlow. PyTorch models that are ConvNet-like and RNN-like (static graphs) can now be shipped to the ONNX format. There is a new module torch.onnx (http://pytorch.org/docs/0.3.0/onnx.html) which provides the API for exporting ONNX models.
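As a quick illustration of the add_param_group function mentioned in the Optimizers section above, the sketch below adds a second parameter group with its own learning rate to an already constructed optimizer. The module sizes and learning rates are arbitrary assumptions for the example.

import torch.nn as nn
import torch.optim as optim

backbone = nn.Linear(10, 4)
head = nn.Linear(4, 2)

# start with an optimizer over the backbone only
opt = optim.SGD(backbone.parameters(), lr=0.1)

# later, bolt on the head's parameters with a smaller learning rate
opt.add_param_group({'params': head.parameters(), 'lr': 0.01})

print(len(opt.param_groups))   # 2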
The operations supported in this release are:

- add, sub (nonzero alpha not supported), mul, div, cat, mm, addmm, neg, tanh, sigmoid, mean, t, transpose, view, split, squeeze
- expand (only when used before a broadcasting ONNX operator; e.g., add)
- prelu (single weight shared among input channels not supported)
- threshold (non-zero threshold/non-zero value not supported)
- Conv, ConvTranspose, BatchNorm, MaxPool, RNN, Dropout, ConstantPadNd, Negate
- elu, leaky_relu, glu, softmax, log_softmax, avg_pool2d
- unfold (experimental support with ATen-Caffe2 integration)
- Embedding (no optional arguments supported)
- RNN
- FeatureDropout (training mode not supported)
- Index (constant integer and tuple indices supported)

Usability improvements

- More cogent error messages during indexing of Tensors / Variables (breaking changes)
- Added a proper error message for specifying dimension on a tensor with no dimensions
- Better error messages for Conv*d input shape checking
- More user-friendly error messages for LongTensor indexing
- Better error messages and argument checking for Conv*d routines
- Trying to construct a Tensor from a Variable now fails more appropriately
- If you are using a PyTorch binary with an insufficient CUDA version, a warning is printed to the user
- Fixed incoherent error messages in load_state_dict
- Fixed the error message for type mismatches with sparse tensors

Bug fixes

torch

- Fixed CUDA lazy initialization so it does not trigger on calls to torch.manual_seed (instead, the calls are queued and run when CUDA is initialized)

Tensor

- If x is 2D, x[[0, 3],] was needed to trigger advanced indexing. The trailing comma is no longer needed; you can do x[[0, 3]]
- x.sort(descending=True) used to incorrectly fail for Tensors. Fixed a bug in the argument checking logic to allow this.
- Tensor constructors with numpy input, e.g. torch.DoubleTensor(np.array([0,1,2], dtype=np.float32)): torch will now copy the contents of the array into a storage of appropriate type. If the types match, it will share the underlying array (no copy), with semantics equivalent to initializing a tensor with another tensor. On CUDA, torch.cuda.FloatTensor(np.random.rand(10,2).astype(np.float32)) will now work by making a copy.
- ones_like and zeros_like now create Tensors on the same device as the original Tensor
- expand and expand_as allow expanding an empty Tensor to another empty Tensor
- torch.HalfTensor supports numpy() and torch.from_numpy
- Added additional size checking for torch.scatter
- Fixed random_ on the CPU (which previously had a max value of 2^32) for DoubleTensor and LongTensor
- Fixed "ZeroDivisionError: float division by zero" when printing certain Tensors
- torch.gels had a truncation bug on the CPU when m > n and returned incorrect results. Fixed.
- Added a check in tensor.numpy() that no positional arguments are passed
- Before a Tensor is moved to CUDA pinned memory, added a check to ensure that it is contiguous
- Fixed symeig on CUDA for large matrices. The bug was that not enough space was being allocated for the workspace, causing undefined behavior.
- Improved the numerical stability of torch.var and torch.std by using Welford's algorithm
- The random number generator returned uniform samples with inconsistent bounds (an inconsistency in the CPU implementation, plus a cuBLAS bug). Now, all uniformly sampled numbers are returned within the bounds [0, 1), across all types and devices.
- Fixed torch.svd so it does not segfault on large CUDA Tensors (fixed an overflow error in the magma bindings)
- Allows an empty index Tensor for index_select (instead of erroring out)
- Previously, when eigenvector=False, symeig returned some unknown value for the eigenvectors. Now this is corrected.

sparse

- Fixed a bug with the 'coalesced' calculation in sparse 'cadd'
- Fixed .type() not converting the indices tensor
- Fixed sparse tensor coalesce on the GPU in corner cases

autograd

- Fixed crashes when calling backward on a leaf variable with requires_grad=False
- Fixed a bug in Variable type() around non-default GPU input
- When torch.norm returned 0.0, the gradient was NaN. The subgradient at 0.0 is now used, so the gradient is 0.0.
- Fixed a correctness issue with advanced indexing and higher-order gradients
- torch.prod's backward was failing on the GPU due to a type error; fixed
- Advanced indexing on Variables now allows the index to be a LongTensor-backed Variable
- Variable.cuda() and Tensor.cuda() are consistent in kwargs options

optim

- torch.optim.lr_scheduler is now imported by default (a short usage sketch appears below)

nn

- Returning a dictionary from an nn.Module's forward function is now supported (it used to throw an error)
- When register_buffer("foo", ...) is called and self.foo already exists, a KeyError is now raised instead of silently failing
- Fixed loading of older checkpoints of RNN/LSTM that were missing _data_ptrs attributes
- nn.Embedding had a hard error when using the max_norm option. This is fixed now.
- When using the max_norm option, the passed-in indices are written to (by the underlying implementation). To fix this, a clone of the indices is now passed to the renorm kernel.
- F.affine_grid can now take non-contiguous inputs
- EmbeddingBag can accept both 1D and 2D inputs now
- Worked around a CuDNN bug where batch sizes greater than 131070 fail in CuDNN BatchNorm
- Fixed nn.init.orthogonal to correctly return orthonormal vectors when rows < cols
- If BatchNorm has only 1 value per channel in total, an error is raised in training mode
- cuDNN bindings now respect the current CUDA stream (previously raised an incoherent error)
- Fixed grid_sample backward when gradOutput is a zero-strided Tensor
- Fixed a segmentation fault when reflection padding is out of Tensor bounds
- If LogSoftmax has only 1 element, -inf was returned. Now this correctly returns 0.0.
- Fixed pack_padded_sequence to accept inputs of arbitrary sizes (not just 3D inputs)
- Fixed ELU higher-order gradients when applied in-place
- Prevented numerical issues with poisson_nll_loss when log_input=False by adding a small epsilon

distributed and multi-gpu

- Allow kwargs-only inputs to DataParallel. This used to fail: n = nn.DataParallel(Net()); out = n(input=i)
- DistributedDataParallel calculates num_samples correctly in Python 2
- Fixed the case of DistributedDataParallel when 1 GPU per process is used
- Allow some params to be requires_grad=False in DistributedDataParallel
- Fixed DataParallel to specify GPUs that don't include GPU-0
- DistributedDataParallel's exit doesn't error out anymore; the daemon flag is set
- Fixed a bug in DistributedDataParallel for the case when the model has no buffers (previously raised an incoherent error)
- Fixed __get_state__ to be functional in DistributedDataParallel (it was returning nothing)
- Fixed a deadlock in the NCCL bindings when the GIL and CudaFreeMutex were starving each other

Among other fixes, model.zoo.load_url now first attempts to use the requests library if available, and then falls back to urllib.
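Since torch.optim.lr_scheduler now comes in with torch.optim by default, here is a minimal sketch of pairing a scheduler with an optimizer. The model, step size, and decay factor are arbitrary assumptions for the example, and the step ordering follows the 0.3-era convention of stepping the scheduler once per epoch.

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)
opt = optim.SGD(model.parameters(), lr=0.1)

# halve the learning rate every 10 epochs
scheduler = optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)

for epoch in range(30):
    scheduler.step()          # 0.3-era convention: step the scheduler at the start of each epoch
    # ... run the usual training loop for one epoch here ...
    print(epoch, opt.param_groups[0]['lr'])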
To download the source code, click here.

6th Dec.' 17 - Headlines

Packt Editorial Staff
06 Dec 2017
6 min read
PyTorch 0.3.0, IBM's Power Systems servers, Core ML support in TensorFlow Lite, Microsoft using AMD's EPYC processors, and Google's new machine learning services for text and video among today's top data science news.

PyTorch removes stochastic functions

PyTorch 0.3.0 released with performance improvements, ONNX/CUDA 9/cuDNN 7 support and bug fixes

PyTorch has released version 0.3.0 with several performance improvements, new layers, the ability to ship models to other frameworks (via ONNX), CUDA 9 and cuDNN 7 support, and “lots of bug fixes.” Among the most important changes, PyTorch has removed stochastic functions, i.e. Variable.reinforce(), because of their limited functionality and broad performance implications. “The motivation for stochastic functions was to avoid book-keeping of sampled values. In practice, users were still book-keeping in their code for various reasons. We constructed an alternative, equally effective API, but did not have a reasonable deprecation path to the new API. Hence this removal is a breaking change,” the PyTorch team said, adding that they have introduced the torch.distributions package to replace stochastic functions. Among the other changes, in v0.3.0 some loss functions can compute per-sample losses in a mini-batch, with more loss functions to be covered in the next release. There is also an in-built profiler in the autograd engine that works for both CPU and CUDA models. In addition to the new API changes, PyTorch 0.3.0 brings a big reduction in framework overhead and 4x to 256x faster softmax/log_softmax, apart from new tensor features. PyTorch models that are ConvNet-like and RNN-like (static graphs) can now be shipped to the ONNX format, a common model interchange format that can be executed in Caffe2, CoreML, CNTK, MXNet, and TensorFlow.

AMD processors coming to Azure machines

Microsoft Azure is the first global cloud provider to deploy AMD EPYC processors

Microsoft is the first global cloud provider to use AMD's EPYC platform to power its data centers. In an official announcement, Microsoft said it has worked closely with AMD to develop the next generation of storage-optimized VMs, called the Lv2-Series, powered by AMD's EPYC processors. The Lv2-Series is designed to support customers with demanding workloads like MongoDB, Cassandra, and Cloudera that are storage intensive and demand high levels of I/O. Lv2-Series VMs use the AMD EPYC 7551 processor, featuring a core frequency of 2.2 GHz and a maximum single-core turbo frequency of 3.0 GHz, and will come in sizes ranging up to 64 vCPUs and 15 TB of local resource disk.

IBM's Power Systems servers speed up deep learning training by 4x

Power System AC922: IBM takes deep learning to the next level with its first Power9-based systems

In its quest to be the AI-workload leader for data centers, IBM unveiled its first Power9 server, the Power System AC922, at the AI Summit in New York. It runs a version of the Power9 chip tuned for Linux, with the four-way multithreading variant SMT4. Power9 chips with SMT4 can offer up to 24 cores, though the chips in the AC922 top out at 22 cores. The fastest Power9 in the AC922 runs at 3.3 GHz. The air-cooled AC922 model 8335-GTG, set for release in mid-December, as well as two other models (one air-cooled and one water-cooled) scheduled to ship in the second quarter of next year, offer two Power9 chips each and run Red Hat and Ubuntu Linux.
In 2018, IBM plans to release servers with a version of the Power9 tuned for AIX and System i, with SMT8 eight-way multithreading and PowerVM virtualization, topping out at 12 cores but likely running at faster clock speeds. The Power9 family is the first processor line to support a range of new I/O technologies, including PCI-Express 4.0, NVLink 2.0, and OpenCAPI. IBM claims that the Power Systems servers can make the training of deep learning frameworks four times faster. The U.S. Department of Energy's Summit and Sierra supercomputers, at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory respectively, are also based on Power9.

AI perfects an imperfect-information game

Study on AI's win in heads-up no-limit Texas hold'em poker wins Best Paper award at NIPS 2017

A detailed study of how the AI Libratus defeated the best human players at heads-up no-limit Texas hold'em poker earlier this year has won the Best Paper award at NIPS 2017. The paper delves deep into what separates imperfect-information games from perfect-information games such as chess and Go, and expounds the idea that was used to defeat top humans in heads-up no-limit Texas hold'em poker. Earlier this year, in January, the artificial intelligence system Libratus, developed by a team at Carnegie Mellon University, beat four professional poker players. The complete paper is available here on arXiv.

No more "versus" between Core ML and TensorFlow Lite

Google announces Apple's Core ML support in TensorFlow Lite

In November, Google announced the developer preview of TensorFlow Lite. Now, Google has collaborated with Apple to add support for Core ML in TensorFlow Lite. With this announcement, iOS developers can leverage the strengths of Core ML for deploying TensorFlow models. In addition, TensorFlow Lite will continue to support cross-platform deployment, including iOS, through the TensorFlow Lite format (.tflite) as described in the original announcement. Support for Core ML is provided through a tool that takes a TensorFlow model and converts it to the Core ML model format (.mlmodel); a short conversion sketch appears at the end of this story. For more information, users can check out the TensorFlow Lite documentation pages and the Core ML converter. The pip-installable package is available at this link: https://pypi.python.org/pypi/tfcoreml/0.1.0.

Google launches new machine learning services for analyzing video and text content

Google announces Cloud Video Intelligence and Cloud Natural Language Content Classification are now generally available

Google has announced the general availability of two new machine learning services: Cloud Video Intelligence and Cloud Natural Language Content Classification. Cloud Video Intelligence is a machine learning application programming interface designed to analyze video content, while Cloud Natural Language Content Classification is an API that helps classify content into more than 700 different categories. Google Cloud Video Intelligence was launched in beta in March this year and has since been fine-tuned for greater accuracy and deeper analysis. “We’ve been working closely with our beta users to improve the model’s accuracy and discover new ways to index, search, recommend and moderate video content. Cloud Video Intelligence is now capable of deeper analysis of your videos — everything from shot change detection, to content moderation, to the detection of 20,000 labels,” Google said. Its code is available on GitHub here.
On the other hand, Google's Content Classification for the Cloud Natural Language service is designed for text-based content. Announced in September, its main job is to read through text and categorize it appropriately. The API can be used to sort documents into more than 700 different categories, such as arts and entertainment, hobbies and leisure, law and government, news, and many more.
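For the Core ML item above, the converter's documented usage looks roughly like the sketch below. The frozen-graph path, output tensor name, and input shape are placeholder assumptions for illustration; check the tfcoreml project page for the exact options.

import tfcoreml

# Convert a frozen TensorFlow graph (.pb) to a Core ML model (.mlmodel).
# The file names and tensor names below are illustrative assumptions.
tfcoreml.convert(
    tf_model_path='frozen_model.pb',
    mlmodel_path='converted_model.mlmodel',
    output_feature_names=['softmax:0'],
    input_name_shape_dict={'input:0': [1, 224, 224, 3]},
)

The resulting .mlmodel file can then be dropped into an Xcode project and run through Core ML on iOS devices.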