Tech Guides - Data

281 Articles

"My Favorite Tools to Build a Blockchain App" - Ed, The Engineer

Aaron Lazar
23 Oct 2017
7 min read
Hey! It's great seeing you here. I'm Ed, the Engineer, and today I'm going to open up my secret toolbox and share some great tools I use to build blockchains. If you're a blockchain developer or a developer-to-be, you've come to the right place! If you're not one, maybe you should consider becoming one.

"There are only 5,000 developers dedicated to writing software for cryptocurrencies, Bitcoin, and blockchain in general. And perhaps another 20,000 had dabbled with the technology, or have written front end applications that connect with the blockchain." - William Mougayar, The Business Blockchain

Decentralized apps, or dapps as they are fondly called, are serverless applications that run on the client side, within a blockchain-based distributed network. Over the next few minutes, we'll take the best tools for building dapps apart one by one. For a better understanding of where they fit into our development cycle, we'll group them into stages - just like the buildings we build. So, shall we begin? Yes, we can!! ;)

The Foundation: Platforms

The first and foremost element for any structure to stand tall and strong is its foundation. The same goes for blockchain apps. Here, in place of all the mortar, we've got decentralized, public blockchains. There are several existing networks, such as Bitcoin, Ethereum, and Hyperledger, that can be used to build dapps. Ethereum and Bitcoin are both decentralized, public chains that are open source, while Hyperledger is private and also open source. Bitcoin may not be a good choice for building dapps, as it was originally designed for peer-to-peer transactions rather than smart contracts.

The Pillars of Concrete: Languages

Once you've got your foundation in place, you need to start raising pillars that will act as the skeleton for your applications. How do we do this? Well, we've got two great languages built specifically for dapps.

Solidity

Solidity is an object-oriented language for writing smart contracts. The best part of Solidity is that you can use it across all platforms, which makes it the number one choice for many developers. It's a lot like JavaScript, and more robust than other languages. Along with Solidity, you'll want Solc, the Solidity compiler. At the moment, Solidity is the language that gets the most support and has the best documentation.

Serpent

Before the dawn of Solidity, Serpent was the reigning language for building dapps - something like how bricks replaced stone in massive structures. Serpent is still used in many places to build dapps, and it has great real-time garbage collection.

The Transit Mixers: Frameworks

After you choose your language, you need a framework to simplify mixing the concrete for your pillars. I find these frameworks interesting:

Embark

Embark is a framework for Ethereum that quickens development and streamlines the process with ready-made tools and functionality. It allows you to develop and deploy dapps easily, or even build a serverless HTML5 application that uses decentralized technology. It also equips you with tools to create new smart contracts and make them available in JavaScript code.

Truffle

Truffle is another great framework for Ethereum, which takes on the task of managing your contract artifacts for you. It includes support for library linking in complex Ethereum apps and provides custom deployments.
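To make the Solidity-plus-Solc workflow concrete, here's a minimal sketch of compiling and deploying a toy contract from Python. It assumes the py-solc-x (a Solc wrapper) and web3.py libraries and an unlocked development node on localhost - none of which this article prescribes, so treat it as one possible setup rather than the canonical one:

```python
# A minimal sketch: compile a toy Solidity contract with py-solc-x (a Solc
# wrapper) and deploy it via web3.py to an unlocked local dev/testnet node.
from solcx import compile_source
from web3 import Web3

SOURCE = """
pragma solidity ^0.8.0;

contract Greeter {
    string public greeting = "Hello, blockchain!";
}
"""

compiled = compile_source(SOURCE, output_values=["abi", "bin"])
_, interface = compiled.popitem()  # only one contract in this source

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
w3.eth.default_account = w3.eth.accounts[0]  # pays the deployment gas

Greeter = w3.eth.contract(abi=interface["abi"], bytecode=interface["bin"])
receipt = w3.eth.wait_for_transaction_receipt(Greeter.constructor().transact())

greeter = w3.eth.contract(address=receipt.contractAddress, abi=interface["abi"])
print(greeter.functions.greeting().call())
```

Pointing the HTTP provider at a testnet node instead of the main chain is the cheap way to experiment, as the testing section below recommends.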
The Contractors: Integrated Development Environments

Maybe you're not the kind that likes to build things from scratch. You just want a one-stop place where you can say what kind of building you want and everything else falls into place. Hire a contractor. If you're looking for the complete package to build dapps, there are two great tools you can use: Ethereum Studio and Remix (Browser-Solidity). These IDEs take care of everything - right from emulating the live network to testing and deploying your dapps.

Ethereum Studio

This is an adapted version of Cloud9, built for Ethereum with some additional tools. It has a blockchain emulator called the sandbox, which is great for writing automated tests. Fair warning: you must pay for this tool, as it's not open source, and you must use Azure Cloud to access it.

Remix

Remix can do pretty much everything Ethereum Studio can. You can run Remix from your local computer and let it communicate with an Ethereum node client on your local machine, which lets you execute smart contracts while connected to your local blockchain. At the time of writing, Remix is still under development.

The Rebound Hammer: Testing Tools

Nothing goes live until it's tried and tested. Just like the rebound hammer you might use to check the quality of concrete, we have a great tool for testing dapps.

Blockchain Testnet

For testing purposes, use the testnet, an alternative blockchain. Whether you're building a dapp on Ethereum or any other chain, I recommend using the related testnet, which works as a substitute for the real blockchain your dapp will eventually run on. Testnet coins are different from actual bitcoins and hold no value, allowing you as a developer or tester to experiment without spending real bitcoins or worrying about breaking the primary chain.

The Wallpaper: dapp Browsers

Once you've developed your dapp, it needs to look pretty for consumers to use. Dapp browsers are essentially the user interfaces of the decentralized web. Two popular tools that bring dapps to your browser are Mist and Metamask.

Mist

Mist is a popular browser for decentralized web apps. Just as Firefox and Chrome are to Web 2.0, the Mist browser aims to be to the decentralized Web 3.0. Ethereum developers can use Mist not only to store Ether and send transactions, but also to deploy smart contracts.

Metamask

With Metamask, you can comfortably run dapps in your browser without running a full Ethereum node. It includes a secure identity vault with a UI for managing your identities on various sites, as well as for signing blockchain contracts.

There! Now you can build a blockchain!

Now you have all the tools you need to make amazing and reliable dapps. I know you're always hungry for more - this GitHub repo created by Christopher Allen has a great listing of tools and resources you can use to begin or improve your blockchain development skills. If you're one of those lazy-but-smart folks who want things done at the click of a mouse button, then BaaS, or Blockchain as a Service, might interest you. There are several big players in this market at the moment, such as IBM, Azure, SAP, and AWS. BaaS is basically for organizations and enterprises that need blockchain networks that are open, trusted, and ready for business.
If you go the BaaS way, let me warn you - you're probably going to miss out on all the fun of building your very own blockchain from scratch. With so many banks and financial entities setting up blockchains for recording transactions and transfers of assets, and investors betting billions on distributed-ledger startups, there are hardly a handful of developers out there who have the required skills. That leaves you with a strong enough reason to build great blockchains and sharpen your skills in the area. Our Building Blockchain Projects book should help you put some of these tools to use in building reliable and robust dapps. So what are you waiting for? Go grab it now and have fun building blockchains!


Top 4 chatbot development frameworks for developers

Sugandha Lahoti
20 Oct 2017
8 min read
The rise of the bots is nigh! If you can imagine a situation involving a dialog, there is probably a chatbot for that. Just look at the chatbot market - text-based email/SMS bots, voice-based bots, bots for customer support, transaction-based bots, entertainment bots, and many others. A large number of enterprises, from startups to established organizations, are seeking to invest in this sector. This has also led to an increase in the number of platforms used for chatbot building. These frameworks incorporate AI techniques along with natural language processing capabilities to assist developers in building and deploying chatbots. Let's start with how a chatbot typically works before diving into some of the frameworks.

Understand: The first step for any chatbot is to understand the user input. This is made possible using pattern matching and intent classification techniques. 'Intents' are the tasks that users might want to perform with a chatbot. Machine learning, NLP, and speech recognition techniques are typically used to identify the intent of the message and extract named entities. Entities are the specific pieces of information extracted from the user's response, i.e. the content associated with an intent.

Respond: After understanding, the next goal is to generate a response, based on the current input message and the context of the conversation. After specifying the intents and entities, a dialog flow is constructed - essentially the replies and feedback expected from the chatbot.

Learn: Chatbots use AI techniques such as natural language understanding and pattern recognition to store and distinguish between the contexts of the information provided, and to elicit a suitable response for future replies. This is important because different requests might have different meanings depending on previous requests.

Top chatbot development frameworks

A bot development framework is a set of predefined classes, functions, and utilities that a developer can use to build chatbots more easily and quickly. Frameworks vary in their level of complexity, integration capabilities, and functionality. Let us look at some of the development platforms used for chatbot building.

API.AI

API.AI, a code-based framework with a simple web-based interface, allows users to build engaging voice and text-based conversational apps using a large number of libraries and SDKs, including Android, iOS, Webkit HTML5, Node.js, and a Python API. It also supports some 32 one-click platform integrations such as Google, Facebook Messenger, Twitter, and Skype, to name a few. API.AI makes use of an agent - a container that transforms natural-language user requests into actionable data. The software tries to find the intent behind a user's reply and matches it to the default or the closest matching intent. After intent matching, it executes the actions and responses the developer has defined for that intent. API.AI also makes use of entities. Once the intents and entities are specified, the bot is trained. API.AI's training module efficiently tracks each user request and lets developers see how requests are parsed and matched to an intent. It also allows for correcting errors and changing requests, thus retraining the bot. API.AI streamlines the entire bot-creating process by helping developers provide the domain-specific knowledge that is unique to a bot's needs, while handling speech recognition, intent, and context management in the backend. Google acquired API.AI in 2016 and is using it to build conversational tools along the lines of Apple's Siri.
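To make the understand/respond steps above concrete, here's a toy, hand-rolled sketch in Python - deliberately simplistic keyword matching and regex entity extraction, not how API.AI or any framework below works internally:

```python
import re

# Toy intent patterns: a real framework learns these from example utterances
# rather than hand-written keyword lists.
INTENTS = {
    "order_status": ["where is my order", "track", "shipping"],
    "greeting": ["hello", "hi", "hey"],
}

RESPONSES = {
    "order_status": "Order {order_id} is on its way!",
    "greeting": "Hello! How can I help you today?",
    None: "Sorry, I didn't understand that.",
}

def understand(message):
    """Return (intent, entities) for a user message."""
    text = message.lower()
    intent = next(
        (name for name, keys in INTENTS.items() if any(k in text for k in keys)),
        None,
    )
    # Entity extraction: pull an order number like "#12345" if present.
    match = re.search(r"#(\d+)", message)
    entities = {"order_id": match.group(1)} if match else {}
    return intent, entities

def respond(message):
    intent, entities = understand(message)
    return RESPONSES[intent].format(order_id=entities.get("order_id", "?"))

print(respond("Hey there!"))
print(respond("Where is my order #12345?"))
```

A real framework replaces the keyword lists with trained intent classifiers and the regex with learned entity recognizers.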
Microsoft Bot Framework

The Microsoft Bot Framework allows building and deploying chatbots across multiple platforms and services such as the web, SMS, non-Microsoft platforms, Office 365, Skype, and more. The framework includes two components: the Bot Builder and Microsoft Cognitive Services. The Bot Builder comprises two full-featured SDKs - for the .NET and Node.js platforms - along with an emulator for testing and debugging, plus a set of RESTful APIs for building bots in other languages. The SDKs support features for simple and easy interactions between bots, and ship with a large collection of prebuilt sample bots for developers to choose from. Microsoft Cognitive Services is a collection of intelligent APIs that simplify a variety of AI tasks, such as allowing the system to understand and interpret the user's needs using natural language in just a few lines of code. These APIs integrate with most modern languages and platforms, and they constantly improve, learn, and get smarter. Microsoft also runs the AI Inner Circle Partner Program to build AI solutions hand in hand with industry partners. Developers can build bots in the Bot Builder SDK using C# or Node.js, then add AI capabilities with Cognitive Services. Finally, they can register the bots on the developer portal, connect them to users across platforms such as Facebook and Microsoft Teams, and deploy them on a cloud like Microsoft Azure. For a step-by-step guide to chatbot building with the Microsoft Bot Framework, you can refer to one of our books on the topic. Sabre Corporation, a customer service provider for travel agencies, has recently announced the development of an AI-powered chatbot that leverages the Microsoft Bot Framework and Microsoft Cognitive Services.

Watson Conversation

IBM's Watson Conversation helps build chatbot solutions that understand natural-language input and use machine learning to respond to customers in a way that simulates conversation between humans. It is built on a neural network of one million Wikipedia words and can be deployed across a variety of platforms, including mobile devices, messaging platforms, and robots. The platform is robust and secure, as IBM allows users to opt out of data sharing. The IBM Watson Tone Analyzer service can help bots understand the tone of the user's input for better management of the experience. The basic steps to create a chatbot using Watson Conversation are as follows. First, create a workspace - a place for configuring information so that intents, user examples, entities, and dialogs are maintained separately for each application. One workspace corresponds to one bot. Next, create intents. Watson Conversation uses multiple conditioned responses to distinguish between similar intents. For example, instead of building specific intents for the locations of different places, it creates a general intent "location" and adds entities to capture the response - say, "location-bedroom": to the right, near the stairs; "location-kitchen": to the left. The third step is entity establishment, which involves grouping entities that might trigger a similar response in the dialog.
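Once a workspace is configured, an application talks to it through the Conversation REST API. Here's a hedged sketch of sending one user turn from Python with the requests library - the endpoint, version date, and credential scheme reflect the 2017-era service and are assumptions, so check IBM's documentation for the current form:

```python
import requests

# Placeholder credentials and workspace ID, supplied by IBM's service dashboard.
WATSON_URL = "https://gateway.watsonplatform.net/conversation/api"
WORKSPACE_ID = "your-workspace-id"
AUTH = ("your-username", "your-password")

def send_message(text, context=None):
    """Send one user turn to the workspace and return Watson's reply."""
    response = requests.post(
        f"{WATSON_URL}/v1/workspaces/{WORKSPACE_ID}/message",
        params={"version": "2017-05-26"},  # API version date
        auth=AUTH,
        json={"input": {"text": text}, "context": context or {}},
    )
    response.raise_for_status()
    body = response.json()
    # The reply text lives under output.text; context carries conversation state
    # and must be passed back on the next turn.
    return body["output"]["text"], body["context"]

reply, context = send_message("Where is the kitchen?")
print(reply)
```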
The dialog flow generated after specifying the intents and entities then goes through testing, followed by embedding into an application, which is connected to other services using the Conversation API. Staples, an office supply retailer, uses Watson Conversation in its "Easy Systems" to simplify the customer's shopping experience.

CXP Designer and Aspect NLU

The Aspect Customer Experience Platform is an application lifecycle management tool for building text and voice-based applications such as chatbots. It provides deployment options across multiple communication channels like text, voice, mobile web, and social media networks. Aspect CXP typically includes the CXP designer for building chatbots and the inbuilt Aspect NLU for advanced natural language capabilities. The CXP designer works by creating dialog objects that provide a menu of options for the frontend as well as the backend. Menu items for the frontend are used to create intents and modules within those intents. The developer can then modify the labels of those intents and modules manually, or use the Aspect NLU to disambiguate similar questions for successful extraction of meaning and intent. The Aspect NLU includes tools for spelling correction, linguistic lexicons (nouns, verbs, and so on), and options for detecting and extracting common data types such as dates, times, and numbers. It also lets developers adjust the meaning extraction to suit their needs. The CXP designer additionally allows chatbots to skip certain steps: for instance, if the user has already provided the tracking ID for a particular package, the chatbot will not prompt them for it again. With Aspect CXP, developers can create and deploy complex chatbots. Radisson Blu Edwardian, a hotel in London, has collaborated with Aspect Software to build an SMS-based AI virtual host.

Conclusion

Another popular chatbot development platform worth mentioning is Facebook Messenger, with over 100,000 monthly active bots, though it lacks cross-platform deployment features. The bot frameworks above are typically used by developers to build chatbots from scratch and require some programming skill. However, there has been a recent rise in automated bot development tools, such as Chatfuel and Motion AI, which typically involve drag-and-drop functionality. With such tools, beginners and non-programmers can create and deploy chatbots within a few minutes. But they lack the extended functionality of typical code-based frameworks, such as the flexibility to store data, produce analytics, or incorporate customized AI tasks. Every chatbot development system, whether framework or tool, serves a different purpose. Choosing the right one depends on the type of application to build, organizational needs, and the developer's expertise.


Introducing Intelligent Apps

Amarabha Banerjee
19 Oct 2017
6 min read
We have been a species obsessed with 'intelligence' ever since gaining consciousness, forever inventing ways to make our lives better through sheer imagination and the application of that intelligence. It comes as no surprise, then, that we want our modern-day creations to be smart as well - be it a web app or a mobile app. The first question that comes to mind is: what makes an application 'intelligent'? A simple answer for budding developers is that intelligent apps are apps that can make intuitive decisions or provide customized recommendations and experiences to their users, based on insights drawn from data collected from their interactions with humans. This brings up a whole set of new questions: how can intelligent apps be implemented, what are the challenges, what are the primary application areas of these so-called intelligent apps, and so on. Let's start with the first question.

How can intelligence be infused into an app?

The answer has many layers, just like an app does. The monumental growth in data science and its underlying data infrastructure has allowed machines to process, segregate, and analyze huge volumes of data in limited time. Now, it looks set to enable machines to glean meaningful patterns and insights from that very same data. One interesting example is predicting user behavior patterns: what movies, food, or brand of clothing a user might be interested in, what songs they might like to listen to at different times of their day, and so on. These are, of course, on the simpler side of the spectrum of intelligent tasks we would like our apps to perform, and many apps by Amazon, Google, Apple, and others are implementing and perfecting such tasks on a day-to-day basis. Complex tasks are a series of simple tasks performed in an intelligent manner. One such complex task would be performing facial recognition and speech recognition, then using them to carry out everyday tasks, be it at home or in the workplace. This is where we enter the realm of science fiction: your mobile app recognizes your voice command while you are driving home and sends automated instructions to your home appliances - your microwave, AC, and PC - so that your food is served hot when you arrive, your room is set at just the right temperature, and your PC has already opened the next project you'd like to work on. All that happens while you enter your home keys-free, thanks to facial recognition software that can map your face and identify you with more than 90% accuracy, even in low lighting conditions. APIs like IBM Watson, AT&T Speech, the Google Speech API, and the Microsoft Face API provide developers with tools to incorporate such features into their apps and create smarter apps. It sounds almost magical! But is it that simple? This brings us to the next question.

What are the developmental challenges for an intelligent app?

The challenges are different for web and mobile apps.

Challenges for intelligent web apps

For web apps, the primary challenge is choosing the right mix of algorithms and APIs to turn your machine learning code into a working web app. Plenty of web APIs, like IBM Watson and AT&T Speech, are available for this, but not all of them can perform all the complex tasks we discussed earlier. Suppose you want an app that successfully performs both voice and speech recognition, and then also performs reinforcement learning by learning from your interactions with it.
You will have to use multiple APIs to achieve this, and their integration into a single app then becomes a key challenge. Here's why. Every API has its own data transfer protocols and backend integration requirements, so the backend burden increases significantly, both in terms of data persistence and in terms of dynamic data availability and security. The fact that each of these smart apps needs a customized user interface design also poses a challenge to the frontend developer: the challenge of making a user interface so fluid and adaptive that it supports the different preferences of different smart apps. Clearly, putting together a smart web app is no child's play. That's perhaps why smart voice-controlled apps like Alexa still work merely as assistants, providing only predefined solutions. Their ability to execute complex voice-based tasks and commands is fairly low, let alone any task not based on voice commands.

Challenges for intelligent mobile apps

For intelligent mobile apps, the challenges are manifold. A key reason is network dependency for data transfer. Although the advent of 4G and 5G mobile networks has greatly improved mobile network speed, network availability and data transfer speeds still pose a major challenge, because intelligent mobile apps need high volumes of data to perform efficiently. To circumvent this limitation, vendors like Google are trying to implement smarter APIs in the mobile device's local storage. But this approach requires a huge increase in the mobile chip's computational capabilities - something that's not currently available. Maybe that's why Google has also hinted at jumping into the chip manufacturing business if its computation needs are not met. Apart from these issues, running multiple intelligent apps at the same time would also require a significant increase in the battery life of mobile devices.

Finally comes the last question.

What are some key applications of intelligent apps?

We have explored some application areas in the previous sections, keeping our focus on web and mobile apps. Broadly speaking, whatever makes our daily life easier is potentially an application area for intelligent apps. From controlling the AC temperature automatically, to controlling the oven and microwave remotely via the vacuum cleaner (the vacuum cleaner, of course, has to have robotic AI capabilities), to driving the car, everything falls in the domain of intelligent apps. The real questions for us are: what can we achieve with our modern computation resources and data handling capabilities, and how can mobile computation capabilities and chip architecture be improved drastically, so that smart apps can perform complex tasks faster and ease our daily workflow? Only the future holds the answer. We are rooting for the day when we rise to become a smarter race by efficiently delegating less important yet intelligence-demanding tasks to our smarter systems through intelligent web and mobile apps. The culmination of these apps, along with hardware-driven AI systems, could eventually lead to independent smart systems - a topic we will explore in the coming days.
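As a small, concrete taste of the speech APIs mentioned above, here's a hedged sketch of transcribing a voice command with the google-cloud-speech Python client. The library, its signatures, and the audio parameters are assumptions about one vendor's client, not a prescription from this article:

```python
# A minimal sketch of speech-to-text with Google's Cloud Speech client
# (assumes the google-cloud-speech package and application credentials).
from google.cloud import speech

client = speech.SpeechClient()

with open("command.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,  # must match the recording
    language_code="en-US",
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    # Each result carries ranked alternatives; take the most likely transcript.
    print(result.alternatives[0].transcript)
```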


AI chip wars: Is Brainwave Microsoft's Answer to Google's TPU?

Amarabha Banerjee
18 Oct 2017
5 min read
When Google decided to design its own chip, the TPU, it generated a lot of buzz around faster and smarter computation with its ASIC-based architecture. Google claimed its move would significantly enable intelligent apps to take over, and industry experts somehow believed a reply from Microsoft was always coming (remember Bing?). Well, Microsoft has announced its arrival into the game - with its own real-time AI-enabled chip called Brainwave. Interestingly, as the two tech giants compete in chip manufacturing, developers are certainly going to have more options as they face the complex computational processes of modern-day systems.

What is Brainwave?

Until recently, Nvidia was the dominant market player in the microchip segment, creating GPUs (Graphics Processing Units) for faster processing and computation. But after Google disrupted the trend with its TPU (Tensor Processing Unit), the surprise package in the market has come from Microsoft - more so because its 'real-time data processing' Brainwave chip claims to be faster than the Google chip (the TPU 2.0, or Cloud TPU). What the Google and Microsoft chips have in common is that both can train and run deep neural networks much faster than any existing chip. Microsoft's claim that Brainwave supports real-time AI systems with minimal lag by itself raises an interesting question: are we looking at a new revolution in the microchip industry? The answer perhaps lies in the methodology and architecture of the two chips, the way they function, and the practical challenges of implementing them in real-world applications.

The Brainwave Architecture: Move over GPU, DPU is here

In case you are wondering what the hype around Microsoft's Brainwave chip is about, the answer lies in its architecture and design. Present-day complex computational standards are defined by high-end games, for which GPUs were originally designed. Brainwave differs completely from the GPU architecture: the core components of a Brainwave chip are Field Programmable Gate Arrays, or FPGAs. Microsoft has deployed a huge number of FPGA modules, on top of which DNN (Deep Neural Network) layers are synthesized. Together, this setup can be compared to something like hardware microservices, where software assigns each task to different FPGA and DNN modules. These software-controlled modules are called DNN Processing Units, or DPUs. This eliminates CPU latency and the need to transfer data to and from the backend. There are two seemingly different methodologies here: the hard DPU and the soft DPU. While Microsoft has used the soft DPU approach, where the allocation of memory modules is determined by software and by the volume of data at processing time, the hard DPU has a predefined memory allocation that doesn't allow the flexibility so vital to real-time processing. The software-controlled feature is exclusive to Microsoft, and unlike other AI processing chips, Microsoft has developed its own data types that are faster to process. This enables the Brainwave chip to perform near real-time AI computations easily. Thus, in a way, Microsoft's Brainwave holds an edge over the Google TPU when it comes to real-time decision-making and computation capabilities.

Brainwave's edge over TPU 2 - Is it real time?
Google ventured into designing its own chips because the growth in user queries was driving up the number of data centers it needed. It realized that instead of running data queries through data centers, it would be far more plausible to perform the computation in the native system - and for that it needed more computational capability than the modern-day market leaders, Intel x86 Xeon processors and Nvidia Tesla K80 GPUs, offered. But Google opted for Application-Specific Integrated Circuits (ASICs) instead of FPGAs, the reason being that they are completely customizable: not specific to one particular neural network, but applicable to multiple networks. The trade-off for this ability to run multiple neural networks was, of course, real-time computation - which Brainwave achieves through its DPU architecture. The initial data released by Microsoft shows that Brainwave has a data transfer bandwidth of 20 TB/sec, 20 times faster than the latest Nvidia GPU chip, and its energy efficiency is claimed to be 4.5 times better than that of current chips. Whether Google will up its ante and improve the existing TPU architecture to make it suitable for real-time computation is something only time can tell.

Source: Brainwave_HOTCHIPS2017 presentation, Microsoft Research Blog

Future outlook and challenges

Microsoft has yet to declare benchmarking results for the Brainwave chip, but Microsoft Azure customers most definitely look forward to its availability for faster and better computational abilities. What is even more promising is that Brainwave works seamlessly with Google's TensorFlow and Microsoft's own CNTK framework. Tech startups like Rigetti, Mythic, and Waves are trying to create mainstream applications that employ AI and quantum computation techniques. These will bring AI to the masses by creating practical AI-driven applications for everyday consumers, and such companies have shown keen interest in both the Microsoft and Google AI chips. In fact, Brainwave may be best suited to companies like these, which want to apply AI to everyday tasks but remain few in number because of the limited computational capabilities of current chips. The challenges for all AI chips, including Brainwave, will still revolve around data handling capabilities, reliability of performance, and improving the memory capabilities of our current hardware systems.


What is Automated Machine Learning (AutoML)?

Wilson D'souza
17 Oct 2017
6 min read
Are you a proud machine learning engineer who hates that the job tests your limits as a human being? Do you dread the long hours of data experimentation and data modeling that leave you high and dry? Automated Machine Learning, or AutoML, can put that smile back on your face. A self-replicating AI algorithm, AutoML is the latest tool being applied in the real world today, and AI market leaders such as Google have invested significantly in researching the field further. AutoML has seen a steep rise in research and new tools over the last couple of years, but its recent mention during Google I/O 2017 has piqued the interest of the entire developer community. What is AutoML all about, and what makes it so interesting?

Evolution of automated machine learning

Before we try to understand AutoML, let's look at what triggered the need for it. Until now, building machine learning models that work in the real world has been a domain ruled by researchers, scientists, and machine learning experts. The process of manually designing a machine learning model involves several complex and time-consuming steps, such as:

- Pre-processing data
- Selecting an appropriate ML architecture
- Optimizing hyperparameters
- Constructing models
- Evaluating the suitability of models

Add to this the several layers of neural networks required for an efficient ML architecture - an n-layer neural network could result in n^n potential networks. This level of complexity could be overwhelming for the millions of developers keen on embracing machine learning. AutoML tries to solve this problem of complexity and makes machine learning accessible to a large group of developers by automating routine but complex tasks, such as the design of neural networks. Since this cuts development time significantly and takes care of several complex tasks involved in building machine learning models, AutoML is expected to play a crucial role in bringing machine learning to the mainstream.

Approaches to automating model generation

With a growing body of research, AutoML aims to automate the following tasks in the field of machine learning: model selection, parameter tuning, meta-learning, and ensemble construction. It does this using a wide range of algorithms and approaches, such as those below.

Bayesian optimization: One of the fundamental approaches for automating model generation is to use Bayesian methods for hyperparameter tuning. By modeling the uncertainty of parameter performance, different variations of the model can be explored, which yields an optimal solution.

Meta-learning and ensemble construction: To further increase AutoML efficiency, meta-learning techniques are used to find and pick optimal hyperparameter settings. These techniques can be coupled with auto-ensemble construction to create effective ensemble models from a collection of models that undergo optimization. Using these techniques, a high level of accuracy can be maintained throughout the automated generation of models.

Genetic programming: Certain tools like TPOT make use of a variation of genetic programming (tree-based pipeline optimization) to automatically design and optimize ML models that offer highly accurate results for a given set of data. This approach uses operators at various stages of the data pipeline, assembled together in the form of a tree-based pipeline. These are then optimized further, and newer pipelines are auto-generated using genetic programming.
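As a small illustration of the first approach, here's a minimal sketch of Bayesian hyperparameter search using the scikit-optimize library - an assumed third-party package chosen for brevity, not one of the tools discussed below:

```python
# A minimal sketch of Bayesian hyperparameter tuning with scikit-optimize's
# BayesSearchCV (an assumed library; search ranges are illustrative only).
from skopt import BayesSearchCV
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

search = BayesSearchCV(
    SVC(),
    # Search space: each hyperparameter gets a prior instead of a fixed grid.
    {"C": (1e-3, 1e3, "log-uniform"), "gamma": (1e-4, 1e1, "log-uniform")},
    n_iter=25,  # number of parameter settings sampled by the optimizer
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```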
If that weren't enough, Google has disclosed in recent posts that it is using a reinforcement learning approach to push AutoML techniques further.

What are some tools in this area?

Although it's still early days, we can already see some frameworks emerging to automate the generation of machine learning models.

Auto-sklearn: Auto-sklearn, the tool that won the ChaLearn AutoML Challenge, provides a wrapper around the popular Python library scikit-learn to automate machine learning - a great addition to the ever-growing ecosystem of Python data science tools. Built on top of Bayesian optimization, it takes away the hassle of algorithm selection, parameter tuning, and ensemble construction while building machine learning pipelines. With auto-sklearn, developers can iterate on and refine their machine learning models rapidly, saving a significant amount of development time. The tool is still in its early stages of development, so expect a few hiccups while using it.

DataRobot: DataRobot offers a machine learning automation platform for data scientists of all levels, aimed at significantly reducing the time to build and deploy predictive models. Since it's a cloud platform, it offers great power and speed throughout the automated model generation process. In addition to automating the development of predictive models, it offers other useful features such as a web-based interface, compatibility with several leading tools such as Hadoop and Spark, scalability, and rapid deployment. It's one of the few machine learning automation platforms that is ready for industry use.

TPOT: TPOT is yet another Python tool for automated machine learning, which uses a genetic programming approach to iterate on and optimize machine learning models. Like auto-sklearn, TPOT is built on top of scikit-learn. Interest in it on GitHub is growing - 2,400 stars, a 100% rise in the past year alone. Its goals are quite similar to auto-sklearn's: feature construction, feature selection, model selection, and parameter optimization. With these goals in mind, TPOT aims at building efficient machine learning systems in less time and with better accuracy.

Will automated machine learning replace developers?

AutoML as a concept is still in its infancy, but as market leaders like Google and Facebook research the field further, it will keep evolving at a brisk pace. Assuming that AutoML will replace humans in the field of data science, however, is a far-fetched thought, nowhere near reality. Here is why: AutoML is meant to make the neural network design process efficient, not to replace the humans and researchers who build neural networks. Its primary goals are to help experienced data scientists be more efficient at their work - enhancing productivity by a huge margin - and to reduce the steep learning curve for the many developers keen on designing ML models, i.e. to make ML more accessible. With the advancements in this field, these are exciting times for developers to embrace machine learning and start building intelligent applications. We see automated machine learning as a game changer with the power to truly democratize the building of AI apps. With automated machine learning, you don't have to be a data scientist to develop an elegant AI app!
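For a flavor of the genetic programming approach, here's a minimal sketch of running TPOT on a toy dataset - the parameters are illustrative, not recommendations, and assume the tpot package is installed:

```python
# A minimal sketch of TPOT's genetic-programming pipeline search.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tpot = TPOTClassifier(
    generations=5,       # evolutionary iterations
    population_size=20,  # candidate pipelines per generation
    verbosity=2,
    random_state=42,
)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))

# Export the best evolved pipeline as plain scikit-learn code.
tpot.export("best_pipeline.py")
```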


Neuroevolution: A step towards the Thinking Machine

Amarabha Banerjee
16 Oct 2017
9 min read
"I propose to consider the question - Can machines think?" - Alan Turing

The goal of AI research has always remained the same: create a machine that has human-like decision-making capabilities based on available information. This includes the machine's ability to analyze and process huge amounts of data and then draw a meaningful inference from it. Machine learning, deep learning, and other old and new paradigms in AI research are all attempts at imparting complex decision-making capabilities to machines or systems. Alan Turing's famous test for AI has set the standard over the years for what qualifies as a smart AI, i.e. a thinking machine. His imitation game has an AI or bot interacting with a human anonymously, in such a way that the human can't tell it's a machine. This not-so-trivial test has seen many adaptations over the years, like the modern-day Tokyo test. These tests set challenging boundaries that machines must cross to be considered capable of possessing intelligence. Neuroevolution, a decades-old theory remodeled in a modern format with the help of neural and deep neural networks, promises to challenge these boundaries and even break them. With neuroevolution, machines aim to solve complex problems on their own, with satisfactory levels of accuracy, even though they do not know how to achieve those results.

Neuroevolution: The Essence

"If a wild animal habitually performs some useless activity, natural selection will favor rival individuals who instead devote time to surviving and reproducing... Ruthless utilitarianism trumps, even if it doesn't always seem that way." - Richard Dawkins

This is the essence of neuroevolution, but the process itself is not as simple. Just as in human evolution, in the beginning a set of algorithms works on a problem. The algorithms that show an inclination to solve the problem in the right way are selected for the next stage, where they undergo random minor mutations - small changes in their inherent structure. Next, we check whether these changes enable the algorithms to achieve the same result with better accuracy or efficiency. The successful ones move to the next stage, with further mutations introduced. This is similar to how nature did the sorting for us as humans evolved out of a need to survive unfamiliar situations. Since the concept uses neural networks, it has come to be known as neuroevolution. Neuroevolution, in the simplest terms, is the process of "descent with modification" by which machines and systems evolve and get better at solving the problems they were built for.

Backpropagation to DNN: The Evolution

Neural networks are made up of nodes, which function like the neurons in the human brain: they receive a set of inputs and generate a response based on the type, intensity, frequency, and so on of the stimuli. An algorithm can be viewed as a node. With backpropagation, the algorithm is modified in an iterative manner: the error generated after each pass is fed back into the system, and the algorithms (nodes) responsible for higher error contributions are assigned less weight in the next pass. Backpropagation is thus a way to assign appropriate weights to nodes by calculating the error contributions of individual nodes. These nodes, when combined in different layers, form the structure of deep neural networks.
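Here's a toy sketch of that weight-adjustment idea for a single sigmoid node, in plain NumPy - the data and learning rate are arbitrary stand-ins, and real DNNs repeat this across many layers of many nodes:

```python
# A toy sketch of backpropagation-style weight updates for one sigmoid node.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # 100 samples, 3 inputs feeding the node
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)  # toy targets

w = np.zeros(3)
lr = 0.5
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # forward pass through the node
    error = p - y                       # how wrong each prediction was
    grad = X.T @ error / len(y)         # each weight's error contribution
    w -= lr * grad                      # down-weight the offending inputs
print(w)
```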
Deep neural networks have separate input and output layers, with a middle layer of hidden nodes forming the core of the DNN. As before, in each iteration the weights of the nodes are adjusted based on their accuracy; the number of iterations varies from one DNN to another. As explained earlier, the system continues to improve on its own without any external stimuli. Now, where have we seen this before? Of course - this looks a lot like a simplified, miniature version of evolution! Unfit nodes are culled by reducing their weight in the overall output, while the ones with favorable results are encouraged, just like natural selection. The only thing missing is mutation and the ability to process it. So this is where we introduce mutations into the successful algorithms and let them evolve on their own. Backpropagation in DNNs doesn't change an algorithm or its approach; it merely increases or decreases the algorithm's overall contribution to the desired result. Forcing random mutations of neural and deep neural networks, and then letting those mutations take shape as the networks jointly try to solve a given problem, seems straightforward enough. Where everything starts getting messy is when different layers or networks start solving the problem in their own predefined ways. One of two things may then happen:

- The neural networks contradict one another and stall the overall problem-solving process; the system cannot take any decision and becomes dormant.
- The neural networks reach some sort of agreement on a decision, which may itself be correct or incorrect.

Both scenarios present us with dilemmas: how to restart a stalled process, and how to achieve better decision-making. The solution to both lies in enabling the DNNs to rectify themselves, first by choosing the correct algorithms, and then by mutating them with the intention of letting them evolve toward decisions of greater accuracy.

Here's a look at some popular implementations of this idea.

Neuroevolution in flesh and blood

Cutting-edge AI research giants like OpenAI, backed by Elon Musk, and Google DeepMind have taken the concept of neuroevolution and applied it to batches of deep neural networks. Both aim to evolve these algorithms so that the smarter ones survive and eventually create better and faster models and systems. Their approaches, however, are starkly different.

The Google implementation

Google's way is simple: it takes a number of algorithms, divides them into groups, and assigns each group one particular task. The algorithms that fare better at solving these problems are chosen for the next stage, much like the reward and punishment system in reinforcement learning. The difference here is that the faster algorithms are not just carried over to the next step; their models and parameters are also tweaked slightly - this is how the mutation is introduced into the successful algorithms. These minor mutations then play out as the modified algorithms try to solve the given problem. Again, the better ones remain and the rest are culled. In this way, the algorithms find their own route to better and better performance, until they are reasonably close to the desired result.
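A stripped-down sketch of that select-and-mutate loop, in NumPy - the fitness function, population size, and mutation scale are all illustrative stand-ins, not how Google's system is actually parameterized:

```python
# A toy select-and-mutate loop: evolve parameter vectors toward a target.
import numpy as np

rng = np.random.default_rng(0)
TARGET = np.array([0.5, -1.0, 2.0])  # stand-in for "solving the task well"

def fitness(params):
    return -np.sum((params - TARGET) ** 2)  # higher is better

population = rng.normal(size=(50, 3))       # 50 candidate "algorithms"
for generation in range(100):
    scores = np.array([fitness(p) for p in population])
    survivors = population[np.argsort(scores)[-10:]]  # keep the top 10
    # Each survivor spawns mutated copies: small random parameter tweaks.
    population = survivors.repeat(5, axis=0) + rng.normal(scale=0.1, size=(50, 3))

best = population[np.argmax([fitness(p) for p in population])]
print(best)  # should land close to TARGET
```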
The most important advantage of this process is that the algorithms keep track of their own evolution as they get smarter. A major limitation of Google's approach is that the time taken for these complex computations is very high, so results take time to show. Also, once the mutation kicks in, the algorithms' behavior is not controlled externally - quite literally, they can go berserk because of the resulting mutation - which means the process can fail even at an advanced stage.

The OpenAI implementation

Contrast this with OpenAI's master-worker approach to neuroevolution. OpenAI used a set of nearly 1,440 algorithms to play Atari and submit their scores to a master algorithm. The algorithms with better scores were then given a mutation and put back into the same process. In more abstract terms, the OpenAI method looks like this: a set of worker algorithms is given a complex problem to solve, and the best scores are passed on to the master algorithm. The better algorithms are then mutated and set to perform the same tasks, with the scores again recorded and passed to the master, over multiple iterations. The master algorithm progressively eliminates the chance of failure, since it knows which algorithms to employ for a given problem. However, it does not know the road to success, as it has access only to the final scores, not to how those scores were achieved. The advantage of this approach is that better results are guaranteed, with no cases of decision conflict or the system stalling. The flip side is that the system only knows its way through the given problem; all the effort spent evolving it must be repeated for a similar but different problem. The process is therefore cumbersome and lengthy.

The Future with Neuroevolution

Human evolution has taken millions of years to reach where we are today. Evolving AIs to pass the Turing test, or further, to pass a university entrance exam, will require significant improvement over the current crop of AI. Amazon's Alexa and Apple's Siri are mere digital assistants. If we want AI-driven smart systems seamlessly integrated into our everyday lives, algorithms with evolutionary characteristics are a must. Neuroevolution might hold the secret to inventing smart AIs that can ultimately propel human civilization to greater heights of development and advancement.

"It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers... They would be able to converse with each other to sharpen their wits. At some stage, therefore, we should have to expect the machines to take control." - Alan Turing

DevOps might be the key to your Big Data project success

Ashwin Nair
11 Oct 2017
5 min read
So, you probably believe in the power of Big Data and the potential it has to change the world, and your company might already have invested in a big data project, or be planning to. That's great! But what if I were to tell you that only 15% of businesses successfully deploy their Big Data projects to production? Surely that can't be a good sign! Now, don't just go freeing up your Big Data budget. Not yet.

Big Data's Big Challenges

For all the hype around Big Data, research suggests that many organizations are failing to leverage its opportunities properly. A recent survey by NewVantage Partners, for example, explored the challenges facing organizations currently running their own Big Data projects or trying to adopt them. Here's what they had to say:

"In spite of the successes, executives still see lingering cultural impediments as a barrier to realizing the full value and full business adoption of Big Data in the corporate world. 52.5% of executives report that organizational impediments prevent realization of broad business adoption of Big Data initiatives. Impediments include lack of organizational alignment, business and/or technology resistance, and lack of middle management adoption as the most common factors. 18% cite lack of a coherent data strategy."

Clearly, even some of the most successful organizations are struggling to get a handle on Big Data. Interestingly, it's not so much gaps in technology or even skills, but rather the lack of culture and organizational alignment that's making life difficult. This isn't actually that surprising. The problem of managing the effects of technological change goes far beyond Big Data - it's impacting the modern workplace in just about every department, from how people work together to how you communicate and sell to customers.

DevOps Distilled

It's out of this scenario that we've seen the irresistible rise of DevOps. DevOps, for the uninitiated, is an agile methodology that aims to improve the relationship between development and operations. It aims to ensure fluid collaboration between teams, with a focus on automating and streamlining monotonous and repetitive tasks within a given development lifecycle, thus reducing friction and saving time. We can perhaps begin to see, then, that this approach - usually applied in typical software development scenarios - might actually offer a solution to some of the problems faced in big data.

A typical Big Data project

Like a software development project, a Big Data project will have multiple teams working on it in isolation. For example, a big data architect will look into the project requirements and design a strategy and roadmap for implementation, while the data storage and admin team will be dedicated to setting up a data cluster and provisioning infrastructure. Finally, you'll probably find data analysts who process, analyze, and visualize data to gain insights. Depending on the scope and complexity of your project, more teams may be brought in - say, data scientists roped in to train and build custom machine learning models.

DevOps for Big Data: A match made in heaven

Clearly, there are a lot of moving parts in a typical Big Data project, with each role performing considerably complex tasks. By adopting DevOps, you'll reduce the silos that exist between these roles, breaking down internal barriers and embedding Big Data within a cross-functional team.
It's also worth noting that this move doesn't just give you an operational efficiency advantage; it also gives you much more control and oversight over strategy. By building a cross-functional team, rather than asking teams to collaborate across functions (which sounds good in theory but always proves challenging), there is a much more acute sense of a shared vision or goal. Problems can be solved together, and discussions can take place constantly and effectively. With the operational problems minimized, everyone can focus on the interesting stuff. By bringing DevOps thinking into big data, you also lay the foundation for what's called continuous analytics. Taking the principle of continuous integration - fundamental to effective DevOps practice, whereby code is integrated into a shared repository after every task or change to ensure complete alignment - continuous analytics streamlines the data science lifecycle by ensuring a fully integrated approach to analytics, where as much as possible is automated through algorithms. This takes away the boring stuff, once again ensuring that everyone on the project team can focus on what's important. We've come a long way from Big Data being a buzzword; today, it's the new normal. If you've got a lot of data to work with, analyze, and understand, you'd better make sure you've got the right environment set up to make the most of it. That means there's no longer an excuse for Big Data projects to fail, and certainly no excuse not to get one up and running. If it takes DevOps to make Big Data work for businesses, then it's a MINDSET worth cultivating and running with.


What's the difference between a data scientist and a data analyst

Erik Kappelman
10 Oct 2017
5 min read
It sounds like a fairly pedantic question to ask what the difference between a data scientist and a data analyst is. But it isn't - in fact, it's a great question that illustrates the way data-related roles have evolved in businesses today. It's pretty easy to confuse the two job roles, and there's certainly a lot of misunderstanding about the difference between a data scientist and a data analyst, even within managerial environments.

Comparing data analysts and data scientists

Data analysts deal with the kind of data you might remember from your statistics classes. This data might come from survey results, lab experiments of various sorts, longitudinal studies, or other forms of social observation. It may also come from observation of natural or man-made phenomena, but its form would still be similar. Data scientists, on the other hand, are going to be looking at things like metadata from billions of phone calls, data scraped from various places around the Internet to forecast Bitcoin prices, or perhaps data related to Internet searches before and after some important event. So their data is often different - but is that all? The tools and skill set required for each are actually quite different as well. Data science is much more entwined with the field of computer science than data analysis is. A good data analyst should have a working knowledge of how computers, networks, and the Internet function, but they don't need to be an expert in any of these things. Data analysts really just need to know a good scripting language for handling data, like Python or R, and maybe a more mathematically advanced tool like MATLAB or Mathematica for more advanced modeling. A data analyst could have a fruitful career knowing only about that much in the realm of technology. Data scientists, however, need to know a lot about how networks and the Internet work. Most data scientists will need to have mastered HTTP, HTML, XML, and SQL, as well as scripting languages like Ruby or Python, and object-oriented languages like Java or C. This is because data scientists spend a lot more time capturing, manipulating, storing, and moving data around than a data analyst would, and these tasks require a different skill set.

Data analysts and data scientists have different forms of conceptual understanding

There will also likely be a difference in the conceptual understanding of a data analyst versus a data scientist. If you asked both to derive and twice differentiate the log-likelihood function of the binomial logistic regression model, it's more likely the data analyst could do it. I would expect data analysts to have a better theoretical understanding of statistics than data scientists, because data scientists don't need much theoretical understanding to be effective; a data scientist is better served by learning more about capturing and analyzing streams of data than by theoretical statistics. The differences are not limited to knowledge or skill set; how data scientists and data analysts approach their work also differs. Data analysts generally know what they're looking for as they begin their analysis. By this I mean that a data analyst may be given the results of a study of a new drug, and the researcher may ask the analyst to explore and hopefully quantify its impact. A data analyst would have no problem performing this task.
Differences are not limited to knowledge or skill set; how data scientists and data analysts approach their work also differs. Data analysts generally know what they are looking for as they begin their analysis. By this I mean, a data analyst may be given the results of a study of a new drug, and the researcher may ask the analyst to explore and, hopefully, quantify the drug's impact. A data analyst would have no problem performing this task.

A data scientist, on the other hand, could be given the task of analyzing the locations of phone calls and finding any patterns that might exist. For the data scientist, the goal is often less defined than it is for a data analyst. In fact, I think this is the crux of the entire difference: data scientists perform far more exploratory data analysis than their data analyst cousins, and this difference in approach really explains the difference in skill sets. Data scientists have skill sets that are primarily geared toward extracting, storing, and finding uses for data. Data analysts primarily analyze data, and their skill set reflects this.

Just to add one more little wrinkle: while calling a data scientist a data analyst is basically correct, calling a data analyst a data scientist is probably not. This is because a data scientist is going to have a handle on more of the skills required of a data analyst than a data analyst would of a data scientist's. This is another reason there is so much confusion around this subject.

Clearing up the difference between a data scientist and data analyst

So now, hopefully, you can tell the difference between a data scientist and a data analyst. I don't believe either field is superior to the other. If you are choosing which field you would like to pursue, what's important is that you choose the field that best complements your skill set. Luckily, it's hard to go wrong, because both data scientists and data analysts usually have interesting and rewarding careers.

Beyond the Bitcoin: How cryptocurrency can make a difference in hurricane disaster relief

Packt
09 Oct 2017
2 min read
More than $350 worth of cryptocurrency guides offered in support of GlobalGiving.org.

During Cybersecurity Month, Packt is partnering with Humble Bundle and three other technology publishers – Apress, John Wiley & Sons, and No Starch Press – for the Humble Book Bundle: Bitcoin & Cryptocurrency, a starter eBook library of blockchain programming guides offered for as little as $1, with each purchase supporting hurricane disaster relief efforts through the nonprofit GlobalGiving.org.

Packed with over $350 worth of valuable developer information, the bundle offers coding instruction and business insights at every level – from beginner to advanced. Readers can learn how to code with Ethereum while at the same time learning about the latest developments in cryptocurrency and emerging business uses of blockchain programming. As with all Humble Bundles, customers can choose how their purchase dollars are allocated between the publishers and charity, and can even "gift" a bundle purchase to others as their donation. Donations of as little as $1 USD can support hurricane relief. The online magazine retailer Zinio will also be offering a limited-time promotion of some of its best tech magazines. You can find the special cryptocurrency package here.

"It's very unusual for tech publishers who normally would compete to come together to do good work for a good cause," said Kelley Allen, Director of Books at Humble Bundle. "Humble Books is really pleased to be able to support their efforts by offering this collection of eBooks about such a timely and cutting-edge subject as cryptocurrency."

The package of 15 eBooks includes the recent titles Bitcoin for Dummies, The Bitcoin Big Bang, Blockchain Basics, Bitcoin for the Befuddled, Mastering Blockchain, and the eBook bestseller Introducing Ethereum and Solidity. The promotional bundles are being released globally in English, and are available in PDF, ePub, and Mobi formats. The offer runs October 9 through October 23, 2017.

What we learned from Oracle OpenWorld 2017

Amey Varangaonkar
06 Oct 2017
5 min read
“Amazon’s lead is over.” These famous words from Oracle CTO Larry Ellison at Oracle OpenWorld 2016 garnered a lot of attention, as Oracle promised their customers an extensive suite of cloud offerings and offered a closer look at their second-generation IaaS data centers. At the recently concluded OpenWorld 2017, Oracle continued their quest to take on AWS and the other major cloud vendors by unveiling a host of cloud-based products and services. Not just that, they have juiced these offerings up with Artificial Intelligence-based features, in line with all the buzz surrounding AI.

Key highlights from the Oracle OpenWorld 2017

Autonomous Database

Oracle announced a totally automated, self-driving database that would require no human intervention for managing or fine-tuning the database. Using machine learning and AI to eliminate human error, the new database guarantees 99.995% availability. Taking another shot at AWS, Ellison promised in his keynote that customers moving from Amazon's Redshift to Oracle's database can expect a 50% cost reduction. Likely to be named Oracle 18c, the new database is expected to ship worldwide by December 2017.

Oracle Blockchain Cloud Service

Oracle joined IBM in the race to dominate the Blockchain space by unveiling its new cloud-based Blockchain service. Built on top of the Hyperledger Fabric project, the service promises to transform the way business is done by offering secure, transparent, and efficient transactions. Other enterprise-critical features such as provisioning, monitoring, backup, and recovery are also among the standard features this service will offer to its customers.

“There are not a lot of production-ready capabilities around Blockchain for the enterprise. There [hasn’t been] a fully end-to-end, distributed and secure blockchain as a service,” said Amit Zavery, Senior VP at Oracle Cloud. It is also worth remembering that Oracle joined the Hyperledger consortium just two months ago, so the signs of an upcoming service were already there.

Improvements to Business Management Services

The new features and enhancements introduced to the business management services were another key highlight of OpenWorld 2017. These features now empower businesses to manage their customers better and to plan for the future with better organization of resources. Some important announcements in this area were:

- The Oracle Adaptive Intelligent Apps will now make use of AI capabilities to improve services for any kind of business
- Developers can now create their own AI-powered Oracle applications that make use of deep learning
- Oracle introduced AI-powered chatbots for better customer and employee engagement
- New features were introduced, such as an enhanced user experience in the Oracle ERP cloud and improved recruiting in the HR cloud services

Key takeaways from Oracle OpenWorld 2017

With these announcements, Oracle have given a clear signal that they're to be taken seriously. They're already buoyed by a strong Q1 result, which saw their revenue from cloud platforms hit $1.5 billion - growth of 51% compared to Q1 2016. Here are some key takeaways from OpenWorld 2017, underlined by the aforementioned announcements:

- Oracle undoubtedly see cloud as the future, and have placed a lot of focus on the performance of their cloud platform.
- They're betting that their familiarity with traditional enterprise workloads will help them win a lot more customers - something Amazon cannot claim.
- Oracle are riding the AI wave and are trying to make their products as autonomous as possible - to reduce human intervention and, to some extent, human error. With enterprises looking to cut costs wherever possible, this could be a smart move to attract more customers.
- The autonomous database will require Oracle to automatically fine-tune, patch, and upgrade its database without causing any downtime. It will be interesting to see if the database can live up to its promise of 99.995% availability.
- Is the role of Oracle DBAs at risk due to the automation? While it is doubtful that they will be out of jobs, there is bound to be a significant shift in their day-to-day operations. DBAs will likely spend less time on traditional administration tasks such as fine-tuning, patching, and upgrading, and instead focus on efficient database design, setting data policies, and securing the data.
- Cybersecurity was a key theme in Ellison's keynote and at OpenWorld 2017 in general. As enterprise Blockchain adoption grows, so does the need for a secure, efficient digital transaction system. Oracle seem to have identified this opportunity, and it will be interesting to see how they compete with the likes of IBM and SAP for major market share.

Oracle's CEO Mark Hurd has predicted that Oracle can win the cloud wars, overcoming the likes of Amazon, Microsoft, and Google. Judging by the announcements at OpenWorld 2017, it seems they may have a plan in place to actually pull it off.

You can watch highlights from the Oracle OpenWorld 2017 on demand here. Don't forget to check out our highly popular book Oracle Business Intelligence Enterprise Edition 12c, your one-stop guide to building an effective Oracle BI 12c system.

Say hello to Streaming Analytics

Amey Varangaonkar
05 Oct 2017
5 min read
In this data-driven age, businesses want fast, accurate insights from their huge data repositories in the shortest time span — and in real time when possible. These insights are essential — they help businesses understand relevant trends, improve their existing processes, enhance customer satisfaction, improve their bottom line, and, most importantly, build and sustain their competitive advantage in the market.

Doing all of this is quite an ask - one that is becoming increasingly difficult to achieve using just the traditional data processing systems, where analytics is limited to the back-end. There is now a burning need for a newer kind of system, where larger, more complex data can be processed and analyzed on the go.

Enter: Streaming Analytics

Streaming Analytics, also referred to as real-time event processing, is the processing and analysis of large streams of data in real time. These streams are basically events that occur as a result of some action - a transaction, a system failure, or a trigger that changes the state of a system at any point in time. Even something as minor or granular as a click can constitute an event, depending upon the context.

Consider this scenario: you are the CTO of an organization that deals with sensor data from wearables. Your organization has to deal with terabytes of data coming in on a daily basis, from thousands of sensors. One of your biggest challenges as CTO would be to implement a system that processes and analyzes the data from these sensors as it enters the system. Here's where streaming analytics can help you, by giving you the ability to derive insights from your data on the go.

According to IBM, a streaming system demonstrates the following qualities:

- It can handle large volumes of data
- It can handle a variety of data - structured or unstructured - analyze it efficiently, and identify relevant patterns accordingly
- It can process every event as it occurs, unlike traditional analytics systems that rely on batch processing

Why is Streaming Analytics important?

The humongous volume of data that companies have to deal with today is almost unimaginable. Add to that the varied nature of the data these companies must handle, and the urgency with which value needs to be extracted from it - it all makes for a pretty tricky proposition. In such scenarios, choosing a solution that integrates seamlessly with different data sources, is fine-tuned for performance, is fast and reliable, and, most importantly, is flexible to changes in technology, is critical. Streaming analytics offers all these features, thereby empowering organizations to gain a significant edge over their competition.

Another significant argument in favour of streaming analytics is the speed at which one can derive insights from the data. Data in a real-time streaming system is processed and analyzed before it registers in a database. This is in stark contrast to analytics on traditional systems, where information is gathered and stored first and the analytics is performed afterwards. Streaming analytics thus supports much faster decision-making than traditional data analytics systems.

Is Streaming Analytics right for my business?

Not all organizations need streaming analytics - especially those that deal with static data, or data that hardly changes over long intervals of time, or those that do not require real-time insights for decision-making. For instance, consider the HR unit of a call centre.
It is sufficient - and efficient - to use a traditional analytics solution to analyze thousands of past employee records, rather than run them through a streaming analytics system. On the other hand, the same call centre could find real value in applying streaming analytics to something like a real-time customer log monitoring system - a system where customer interactions and context-sensitive information are processed on the go. This can help the organization find opportunities to provide unique customer experiences and improve its customer satisfaction score, alongside a whole host of other benefits.

Streaming analytics is slowly finding adoption in a variety of domains where companies are looking for that crucial competitive advantage - sensor data analytics, mobile analytics, and business activity monitoring being some of them. With the rise of the Internet of Things, data from IoT devices is also increasing exponentially; streaming analytics is the way to go here as well. In short, streaming analytics is ideal for businesses dealing with time-critical missions and those working with continuous streams of incoming data, where decision-making has to be instantaneous. Companies that obsess about real-time monitoring of their businesses will also find streaming analytics useful - just integrate your dashboards with your streaming analytics platform!

What next?

It is safe to say that, with time, the amount of information businesses manage is going to rise exponentially, and so will its variety. As a result, it will get increasingly difficult to process volumes of unstructured data and gain insights from them using just the traditional analytics systems. Adopting streaming analytics into the business workflow will therefore become a necessity for many businesses. Apache Flink, Spark Streaming, Microsoft's Azure Stream Analytics, SQLstream Blaze, Oracle Stream Analytics, and SAS Event Processing are all good places to begin your journey through the fleeting world of streaming analytics. You can browse through this list of learning resources from Packt to know more:

- Learning Apache Flink
- Learning Real Time processing with Spark Streaming
- Real Time Streaming using Apache Spark Streaming (video)
- Real Time Analytics with SAP Hana
- Real-Time Big Data Analytics
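Before you dive into those resources, it may help to see the core idea in miniature. Here is a minimal sketch in plain Python - no streaming framework, and the event stream below is just a stand-in - of the per-event processing described above, where state is updated as each event arrives rather than in a periodic batch job:

```python
# Process events one at a time as they arrive, maintaining a sliding window
# of state - the per-event counterpart to a nightly batch job.
import time
from collections import deque

class SlidingWindowCounter:
    """Counts how many events were seen in the last `window_seconds` seconds."""
    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.timestamps = deque()

    def on_event(self, event_time):
        self.timestamps.append(event_time)
        # Evict events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] < event_time - self.window:
            self.timestamps.popleft()
        return len(self.timestamps)

counter = SlidingWindowCounter(window_seconds=60)
# A stand-in for a real stream of timestamped sensor events.
for event in ({"ts": time.time() + i} for i in range(5)):
    rate = counter.on_event(event["ts"])
    print(f"events in the last minute: {rate}")
```

A real deployment would replace the toy loop with a live source (sensor feeds, message queues) and run the same per-event logic inside one of the frameworks listed above.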

Top 5 misconceptions about data science

Erik Kappelman
02 Oct 2017
6 min read
Data science is a well-defined, serious field of study and work. But the term 'data science' has become a bit of a buzzword. Yes, 'data scientists' have become increasingly important to many different types of organizations, but the term has also become a trend word in tech recruitment. The fact that these words are thrown around so casually has led to a lot of confusion about what data science and data scientists actually are. I would formerly have included myself in this group: when I first heard the word data scientist, I assumed that data science was actually just statistics in a fancy hat. It turns out I was quite wrong. So here are the top 5 misconceptions about data science.

Data science is statistics and vice versa

I fell prey to this particular misconception myself. What I have come to find out is that statistical methods are used in data science, but conflating the two is really inaccurate. This would be somewhat like saying psychology is statistics because research psychologists use statistical tools in studies and experiments. So what's the difference? I am of the mind that the primary difference lies in the level of understanding of computing required to succeed in each discipline. While many statisticians have an excellent understanding of things like database design, one could be a statistician and actually know nothing about it. To succeed as a statistician, all the way up to the doctoral level, you really only need to master basic modeling tools like R, Python, and MATLAB. A data scientist, by contrast, needs to be able to mine data from the Internet, create machine learning algorithms, design, build, and query databases, and so on.

Data science is really computer science

This is the other half of the first misconception. While it is tempting to lump data science in with computer science, the two are quite different. For one thing, computer science is technically a field of mathematics focused on algorithms and optimization, and data science is definitely not that. Data science requires many skills that overlap with those of computer scientists, but data scientists aren't going to need to know anything about computer hardware, kernels, and the like. A data scientist ought to have some understanding of network protocols, but even here, the level of understanding required for data science is nothing like that held by the average computer scientist.

Data scientists are here to replace statisticians

In this case, nothing could be further from the truth. One way to keep this straight is that statisticians are in the business of researching existing statistical tools as well as trying to develop new ones. These tools are then turned around and used by data scientists and many others. Data scientists are usually more focused on applied solutions to real problems, and less interested in what many might regard as pure research.

Data science is primarily focused on big data

This is an understandable misconception. Just so we're clear, Wikipedia defines big data as "a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them." Big data, then, is really just the study of how to deal with, well, big datasets. Data science absolutely has a lot to contribute in this area. Data scientists usually have skills that work really well when it comes to analyzing big data.
Skills related to databases, machine learning, and how data is transferred around a local network or the Internet are skills most data scientists have, and they are very helpful when dealing with big data. But data science is actually much broader in scope. Big data is a hot topic right now and is receiving a lot of attention, with research into the field attracting a lot of private and public funding. In any situation like this, many different types of people working in a diverse range of areas are going to try to get in on the action. As a result, talking up data science's connection to big data makes sense if you're a data scientist - it's really about effective marketing. So, you might work with big data if you're a data scientist - but data science is also much, much more than just big data.

Data scientists can easily find a job

I thought I would include this one to add a different perspective. While there are many more misconceptions about what data science is or what data scientists do, I think this is actually a really damaging misconception and should be discussed. I hear a lot of complaints these days from people with sought-after skill sets who still cannot find gainful employment. Data science is like any other field, and there are always going to be a whole bunch of people who are better at it than you. Don't become a data scientist because you're sure to get a job - you're not. The industries related to data science are absolutely growing right now, and will continue to do so for the foreseeable future. But that doesn't mean people who can call themselves data scientists just automatically get jobs. You have to have the talent, but you also need to network and do all the same things you need to do to get on in any other industry. The point is, it's not easy to get a job no matter what your field is; study and practice data science because it's awesome, not because you heard it's a sure way to get a job.

Misconceptions abound, but data science is a wonderful field of research, study, and practice. If you are interested in pursuing a career or degree related to data science, I encourage you to do so - just make sure you have the right idea about what you're getting yourself into.

Erik Kappelman wears many hats, including blogger, developer, data consultant, economist, and transportation planner. He lives in Helena, Montana and works for the Department of Transportation as a transportation demand modeler.

What is coding as a service?

Antonio Cucciniello
02 Oct 2017
4 min read
If you want to know what coding as a service is, you have to start with artificial intelligence. Put simply, coding as a service is using AI to build websites - using your machine to write code so you don't have to.

The challenges facing engineers and programmers today

In order to give you a solid understanding of what coding as a service is, you must first understand where we are today. Typically, we have programs that are made by software developers or engineers. These programs are usually created to automate a task or make tasks easier - think of anything that speeds up processing or automates a repetitive job. This is, and has been, extremely beneficial. The productivity gained from automated applications and tasks allows us, as humans and workers, to spend more time on creating important things and coming up with more groundbreaking ideas. This is where artificial intelligence and machine learning come into the picture.

Artificial intelligence and coding as a service

Recently, with the gains in computing power that have come with time and with research breakthroughs, computers have become more and more powerful, allowing AI applications to enter common practice. Today, there are applications that allow users to detect objects in images and videos in real time, translate speech to text, and even determine the emotions in a text sent by someone else. For an example of an AI application in use today, take the Amazon Alexa or Echo devices. You talk to one, it understands your speech, and it then completes a task based on what you said. The ability to understand speech was previously reserved for humans; now, Alexa is capable of understanding everything you say, given that it is "trained" to understand it. A capability once expected only of humans is being filtered through to technology.

How coding as a service will automate boring tasks

Today, we have programmers who write applications for many uses and make things such as websites for businesses. As things progress and become more and more automated, programmers' efficiency will increase and the need for additional manpower will shrink. Coding as a service, otherwise known as CaaS, will mean even fewer programmers are needed. It mixes the efficiencies we already have with artificial intelligence to carry out programming tasks for a user. Using natural language processing to understand exactly what the user or customer is saying and means, it will be able to make edits to websites and applications on the fly. Not only will it be able to make edits but, combined with machine learning, a CaaS system can come up with recommendations from past data and make edits on its own. Efficiency-wise, it is cheaper to own a computer than to pay a human, especially when a computer will work around the clock for you and never get tired. Imagine paying an extremely low price - lower than what you might already pay to get a website made - for getting your website built, or maybe your small application created.
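To make this less abstract, here is a toy sketch of that pipeline in Python. Every name in it is invented for illustration, and the keyword rules are a stand-in for the trained NLP models a real CaaS system would use - but the shape (plain-English request in, concrete edit out) is the same:

```python
# Toy illustration of the CaaS idea: translate a natural-language request
# into a concrete website edit. All names here are hypothetical; real
# systems would use NLP/ML models rather than keyword rules.
import re

EDIT_RULES = [
    (r"make the (\w+) (\w+)",
     lambda m: f"css: {m.group(1)} {{ color: {m.group(2)}; }}"),
    (r"add a heading saying (.+)",
     lambda m: f"html: <h1>{m.group(1)}</h1>"),
]

def request_to_edit(request: str) -> str:
    """Map a plain-English request to a (toy) edit instruction."""
    for pattern, build_edit in EDIT_RULES:
        match = re.search(pattern, request.lower())
        if match:
            return build_edit(match)
    return "no rule matched - a real CaaS system would ask a clarifying question"

print(request_to_edit("Please make the header blue"))
print(request_to_edit("Add a heading saying Welcome!"))
```

The interesting engineering lives in the step this sketch fakes with regexes: mapping intent to edits reliably, and learning from past edits which recommendation to make next.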
Conclusion

Every new technology comes with pros and cons. Overall, the number of software developers may decrease; or, as a developer, this may free up your time from more menial tasks and enable you to further specialize and broaden your horizons. Artificial intelligence programs such as coding as a service could take on plenty of the underlying work, leaving the heavier lifting to human programmers. You just need to use the positives to your advantage!

How IBM Watson is paving the road for Healthcare 3.0

Kunal Parikh
29 Sep 2017
5 min read
[box type="shadow" align="" class="" width=""]Matt Kowalski (in Gravity): Houston, in the blind.[/box] Being an oncologist is a difficult job. Every year, 50,000 research papers are published on just Oncology. If an Oncologist were to read every one of them, it will take nearly 29 hours of reading every workday to stay updated on this plethora of information. Added to this is the challenge of dealing with nearly 1000 patients every year. Needless to say, a modern-day physician is bombarded with information that doubles every three years. This wide gap between the availability of information and the ability to access it in a manner that’s practically useful is simply getting wider. No wonder doctors and other medical practitioners can feel overwhelmed and lost in space, sometimes! [box type="shadow" align="" class="" width=""]Mission Control: Shariff, what's your status? Shariff: Nearly there.[/box] Advances in the field of Big Data and cognitive computing are helping make strides in solving this kind of pressing problems facing the healthcare industry. IBM Watson is at the forefront of solving such scenarios and as time goes by the system will only become more robust. From a strict technological standpoint, the new applications of Watson are impressive and groundbreaking: The system is capable of combing through 600,000 pieces of medical evidence, 2 million pages of text from 42 medical journals and clinical trials in the area of oncology research, and 1.5 million patient records to provide on-the-spot treatment recommendations to health care providers. According to IBM, more than 90 percent of the nurses who have worked with Watson follow the guidance the system gives to them. - Infoworld Watson, who? IBM Watson is an interactive expert system that uses cognitive computing, natural language processing, and evidence-based learning to arrive at answers to questions posed to it by its users in plain English. Watson doesn’t just stop with hypotheses generation but goes ahead and proposes a list of recommendations to the user. Let’s pause and try to grasp what this means for a healthcare professional. Imagine a doctor typing in his/her iPad “A cyst found in the under-arm of the patient and biopsy suggesting non-Hodgkin's Lymphoma”. With so many cancers and alternative treatments available to treat them, to zero down on the right cure at the right time is a tough job for an oncologist. IBM Watson taps into the collective wisdom of Oncology experts - practitioners, researchers, and academicians across the globe to understand the latest advances happening inside the rapidly evolving field of Oncology. It then culls out information most relevant to the patient’s particular situation after considering their medical history. Within minutes, Watson then comes up with various tailored approaches that the doctor can adopt to treat his/her patient. Watson can help healthcare professionals narrow down on the right diagnosis, take informed and timely decisions and put in place treatment plans for their patients. All the doctor has to do is ask a question while mentioning the symptoms a patient is experiencing. This question-answer format is pretty revolutionary in that it can completely reshape how healthcare exists. How is IBM Watson redefining Healthcare? As more and more information is fed into IBM Watson, doctors will get highly customised recommendations to treat their patients. The impact on patient care and hospital cost can be tremendous. 
For healthcare professionals, Watson can:

- Reduce or eliminate time spent mining insights from an ever-growing body of research
- Provide a list of recommended treatment options, each with a confidence score attached
- Design treatment plans based on the option chosen

In short, it can act as a highly effective personal assistant to this group. This means these professionals can be more competent and more successful, and have the time and energy to make deep personal connections with their patients, thereby elevating patient care to a whole different level.

For patients, Watson can:

- Act as an interactive interface, answering their queries and connecting them with their healthcare professionals
- Provide at-home diagnostics and healthcare advice
- Keep their patient records updated and synced with their hospitals

Thus, Watson can help patients make informed medical choices, take better care of themselves, and alleviate the stress and anxiety induced by a chaotic and opaque hospital environment.

For the healthcare industry, it means a reduction in overall cost to hospitals, reduced investment in post-treatment patient care, higher rates of success, and a reduction in errors due to oversight, misdiagnosis, and other human mistakes. This can indirectly improve key administrative metrics, lower employee burnout and churn rates, improve morale, and bring other intangible benefits.

The implications of such a transformation are not limited to health care alone.

What about insurance, Watson?

IBM Watson can have a substantial impact on insurance companies too. Insurance, a fiercely debated topic, is a major cost in healthcare. Increasing revenue potential, better customer relationships, and reduced costs are some areas where Watson will start disrupting medical insurance. But that's just the beginning. Tighter integration with hospitals, more data on patient care, and more information on newer remedies will provide groundbreaking insights to insurance companies. These insights will help them figure out the right premiums and underwriting frameworks.

Moreover, the above is not a scene set in some distant future. In Japan, the insurance company Fukoku Mutual Life Insurance replaced 34 employees with IBM Watson; customers of Fukoku can now talk directly to an AI system instead of a human being to settle payments. Fukoku paid a one-time fee of $170,000, along with a yearly maintenance charge of $128,000, to IBM for Watson's services. The company plans to recover this cost by replacing part of its team of sales personnel, insurance agents, and customer care staff - potentially saving nearly a million dollars annually.

These are interesting times, and some may even call them anxiety-inducing.

[box type="shadow" align="" class="" width=""]Shariff: No, no, no, Houston, don't be anxious. Anxiety is bad for the heart.[/box]

Is Facebook-backed PyTorch better than Google's TensorFlow?

Savia Lobo
15 Sep 2017
6 min read
[dropcap]T[/dropcap]he rapid rise of tools and techniques in Artificial Intelligence and Machine Learning of late has been astounding. Deep learning, or "machine learning on steroids" as some say, is one area where data scientists and machine learning experts are spoilt for choice in terms of the libraries and frameworks available. Two libraries are starting to emerge as frontrunners: TensorFlow is the established best in class, while PyTorch is a new entrant in the field that could compete with it. So, PyTorch vs TensorFlow - which one is better? How do the two deep learning libraries compare to one another?

TensorFlow and PyTorch: the basics

Google's TensorFlow is a widely used machine learning and deep learning framework. Open sourced in 2015 and backed by a huge community of machine learning experts, TensorFlow has quickly grown to be THE framework of choice for many organizations' machine learning and deep learning needs. PyTorch, on the other hand, is a recently developed Python package by Facebook for training neural networks, adapted from the Lua-based deep learning library Torch. PyTorch is one of the few available DL frameworks that use a tape-based autograd system to allow the building of dynamic neural networks in a fast and flexible manner.

PyTorch vs TensorFlow

Let's get into the details - let the PyTorch vs TensorFlow match-up begin...

What programming languages support PyTorch and TensorFlow?

Although primarily written in C++ and CUDA, TensorFlow contains a Python API sitting over the core engine, making it easier for Pythonistas to use. Additional APIs for C++, Haskell, Java, Go, and Rust are also included, which means developers can code in their preferred language. Although PyTorch is a Python package, there's provision for you to code using the basic C/C++ languages via the APIs provided. If you are comfortable with the Lua programming language, you can code neural network models in PyTorch using the Torch API.

How easy are PyTorch and TensorFlow to use?

TensorFlow can be a bit complex to use as a standalone framework, and can pose some difficulty in training deep learning models. To reduce this complexity, one can use the Keras wrapper, which sits on top of TensorFlow's complex engine and simplifies the development and training of deep learning models. TensorFlow also supports distributed training, which PyTorch currently doesn't. Due to the inclusion of the Python API, TensorFlow is also production-ready, i.e., it can be used to train and deploy enterprise-level deep learning models.

PyTorch was rewritten in Python due to the complexities of Torch. This makes PyTorch more native to developers. It has an easy-to-use framework that provides maximum flexibility and speed. It also allows quick changes within the code during training without hampering performance. If you already have some experience with deep learning and have used Torch before, you will like PyTorch even more, because of its speed, efficiency, and ease of use. PyTorch includes a custom-made GPU allocator, which makes deep learning models highly memory efficient. Because of this, training large deep learning models becomes easier. Hence, large organizations such as Facebook, Twitter, Salesforce, and many more are embracing PyTorch. In this PyTorch vs TensorFlow round, PyTorch wins out in terms of ease of use.

Training deep learning models with PyTorch and TensorFlow

Both TensorFlow and PyTorch are used to build and train neural network models.
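As a quick taste before unpacking the differences, here is a minimal sketch of the two styles - graph-then-session versus define-by-run - using the APIs as they stood around the time of writing (TensorFlow 1.x; note that 2017-era PyTorch wrapped tensors in torch.autograd.Variable, which current releases fold into torch.tensor). Both snippets compute y = w * x and the gradient dy/dw:

```python
# TensorFlow 1.x: define a static graph first, then execute it in a session.
import tensorflow as tf  # assumes a 1.x release
x = tf.placeholder(tf.float32)
w = tf.Variable(3.0)
y = w * x
grad = tf.gradients(y, w)[0]
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad, feed_dict={x: 2.0}))  # dy/dw = x = 2.0

# PyTorch: the graph is built on the fly as ordinary Python executes.
import torch
w_t = torch.tensor(3.0, requires_grad=True)
y_t = w_t * torch.tensor(2.0)
y_t.backward()    # tape-based autograd replays the recorded operations
print(w_t.grad)   # tensor(2.)
```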
TensorFlow works on an SCG (Static Computational Graph), meaning the graph is defined statically before the model starts executing; once execution starts, the only way to feed and tweak values within the model is through constructs like tf.Session and tf.placeholder tensors. PyTorch is well suited to training RNNs (Recurrent Neural Networks), as they run faster in PyTorch than in TensorFlow. PyTorch works on a DCG (Dynamic Computational Graph), so one can define and change the model on the go. In a DCG, each block can be debugged separately, which makes the training of neural networks easier.

TensorFlow has recently come up with TensorFlow Fold, a library designed to create TensorFlow models that work on structured data. Like PyTorch, it implements DCGs and gives massive computational speed-ups of up to 10x on CPU and more than 100x on GPU! With the help of Dynamic Batching, you can now implement deep learning models that vary in size as well as structure.

Comparing GPU and CPU optimizations

TensorFlow has faster compile times than PyTorch and provides flexibility for building real-world applications. It can run on literally any kind of processor - from a CPU, GPU, or TPU to mobile devices and the Raspberry Pi (IoT devices). PyTorch, on the other hand, includes tensor computations that can speed up deep neural network models by up to 50x or more using GPUs. These tensors can dwell on the CPU or the GPU, and both backends are written as independent libraries, making PyTorch efficient to use irrespective of the neural network's size.

Community support

TensorFlow is one of the most popular deep learning frameworks today, and with this comes huge community support. It has great documentation and an eloquent set of online tutorials. TensorFlow also includes numerous pre-trained models that are hosted and available on GitHub. These models aid developers and researchers who are keen to work with TensorFlow, giving them ready-made material that saves time and effort. PyTorch, on the other hand, has a relatively smaller community, since it was developed fairly recently. Compared to TensorFlow, the documentation isn't as strong, and code samples are not as readily available. However, PyTorch does allow individuals to share their pre-trained models with others.

PyTorch and TensorFlow - a David and Goliath story

As it stands, TensorFlow is clearly favoured and used more than PyTorch, for a variety of reasons. TensorFlow is best suited to a wide range of practical purposes. It is the obvious choice for many machine learning and deep learning experts because of its vast array of features, and its maturity in the market is important too. It has better community support, along with multiple language APIs available. It has good documentation and is production-ready due to the availability of ready-to-use code. Hence, it is better suited to someone who wants to get started with deep learning, or to organizations wanting to productize their deep learning models.

PyTorch is relatively new and has a smaller community than TensorFlow, but it is fast and efficient. In short, it gives you all the power of Torch wrapped in the usefulness and ease of Python. Because of its efficiency and speed, it's a good option for small, research-based projects. As mentioned earlier, companies such as Facebook, Twitter, and many others are using PyTorch to train deep learning models. However, its adoption is yet to go mainstream. The potential is evident, but PyTorch is just not ready yet to challenge the beast that is TensorFlow.
However, considering its growth, the day may not be far off when PyTorch is further optimized and offers more functionality - to the point that it becomes the David to TensorFlow's Goliath.