
Tech News - Data


Google’s new What-if tool to analyze Machine Learning models and assess fairness without any coding

Natasha Mathur
12 Sep 2018
3 min read
Google’s PAIR (People + AI Research) team has come out with a new tool called “What-If”. It is a new feature of the open-source TensorBoard web application that allows users to analyze an ML model without writing any code, and it provides an interactive visual interface for exploring model results. The What-If tool comes packed with two major features: Counterfactuals, and Performance and Algorithmic Fairness analysis. Let’s have a look at both.

Counterfactuals

What-If allows you to compare a datapoint to the most similar point for which your model predicts a different result. Such points are known as "counterfactuals". The tool also lets you edit a datapoint by hand and explore how the model’s prediction changes. In the example below, the What-If tool is used on a binary classification model that predicts whether a person’s income is more than $50k, based on public census data from the UCI census dataset.

[Figure: Comparing counterfactuals]

This is a prediction task commonly used by ML researchers when analyzing algorithmic fairness. Here the model predicted that the selected person’s income is more than $50k. The tool then automatically locates the most similar person in the dataset for whom the model predicted earnings of less than $50k and compares the two cases side by side.

Performance and Algorithmic Fairness analysis

With the What-If tool, you can also explore the effects of different classification thresholds, taking into account constraints such as different numerical fairness criteria. The figure below presents the results of a smile-detector model trained on the open-source CelebA dataset, which comprises annotated face images of celebrities.

[Figure: Comparing the performance of two slices of data in a smile detection model]

In the figure above, the dataset has been divided by whether the people have brown hair. Each of the two groups gets a ROC curve and a confusion matrix of the predictions, along with sliders for setting how confident the model must be before it decides that a face is smiling. Here, the What-If tool automatically sets the confidence thresholds for the two groups in order to optimize for equal opportunity.

Apart from these major features, the What-If tool also lets you visualize your dataset directly using Facets, manually edit examples from your dataset, and automatically generate partial dependence plots (which show how the model’s predictions change as any single feature changes). Additionally, Google’s PAIR team released a set of demos using pre-trained models to illustrate the capabilities of the What-If Tool. These demos include detecting misclassifications (a multiclass classification model), assessing fairness in binary classification models (an image classification model), and investigating model performance across different subgroups (a regression model).

“We look forward to people inside and outside of Google using this tool to better understand ML models and to begin assessing fairness,” says the PAIR team.

For more information on What-If, be sure to check out the official Google AI blog.

Dr. Fei Fei Li, Google’s AI Cloud head steps down amidst speculations; Dr. Andrew Moore to take her place
Introducing Deon, a tool for data scientists to add an ethics checklist
Google wants web developers to embrace AMP. Great news for users, more work for developers
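The counterfactual feature described in the article above boils down to a nearest-neighbour search restricted to points the model classifies differently. The sketch below illustrates that idea with scikit-learn on synthetic data; it is not the What-If Tool's own implementation, and the dataset and model here are stand-ins.

```python
# Toy illustration of "counterfactual" lookup: for a chosen datapoint, find the
# most similar point that the model classifies differently. Not the What-If
# Tool's code -- just the underlying idea, on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
preds = model.predict(X)

def nearest_counterfactual(index: int) -> int:
    """Return the index of the closest point with a different predicted label."""
    target = X[index]
    candidates = np.where(preds != preds[index])[0]   # points labelled differently
    dists = np.linalg.norm(X[candidates] - target, axis=1)
    return int(candidates[np.argmin(dists)])

i = 0
j = nearest_counterfactual(i)
print(f"point {i} predicted {preds[i]}, counterfactual {j} predicted {preds[j]}")
print("feature differences:", X[j] - X[i])
```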


Facebook introduces Rosetta, a scalable OCR system that understands text on images using Faster-RCNN and CNN

Bhagyashree R
12 Sep 2018
3 min read
Yesterday, researchers at Facebook introduced a machine learning system named Rosetta for scalable optical character recognition (OCR). The system extracts text from more than a billion public Facebook and Instagram images and video frames. The extracted text is then fed into a text recognition model that has been trained on classifiers, which helps it understand the context of the text and the image together.

Why Rosetta was introduced

Rosetta will help in the following scenarios:
Providing a better user experience by giving users more relevant photo search results.
Making Facebook more accessible for the visually impaired by incorporating the text into screen readers.
Helping Facebook proactively identify inappropriate or harmful content.
Improving the accuracy of photo classification in News Feed to surface more personalized content.

How it works

[Image: Rosetta's text extraction model. Source: Facebook]

Text extraction on an image is done in two steps:

Text detection

In this step, rectangular regions that potentially contain text are detected. Detection is based on Faster R-CNN, a state-of-the-art object detection network, but the ResNet convolutional body is replaced with a ShuffleNet-based architecture for efficiency. The anchors in the region proposal network (RPN) are also modified to generate wider proposals, as text words are typically wider than the objects the RPN was designed for. The whole detection system is trained jointly in a supervised, end-to-end manner. The model is bootstrapped with an in-house synthetic dataset and then fine-tuned with human-annotated datasets so that it learns real-world characteristics. It is trained using the recently open-sourced Detectron framework, powered by Caffe2.

Text recognition

[Image: Architecture of the text recognition model. Source: Facebook]

In the second step, a convolutional neural network (CNN) is used to recognize and transcribe the word in each detected region. The recognition model uses a CNN based on the ResNet18 architecture, which is both accurate and computationally efficient. For training, reading the text in an image is treated as a sequence prediction problem: the input is an image containing the text to be recognized, and the output is the sequence of characters in the word image. Treating recognition as sequence prediction allows the system to handle words of arbitrary length, including words it never saw during training.

This two-step model provides several benefits, including decoupling the training of the detection and recognition models, recognizing words in parallel, and independently supporting text recognition for different languages.

Rosetta has been widely adopted by various products and teams within Facebook and Instagram. It offers a cloud API for text extraction from images and processes a large volume of images uploaded to Facebook every day. In the future, the team plans to extend the system to extract text from videos more efficiently and to support more of the languages used on Facebook.

To get a more in-depth idea of how Rosetta works, check out the researchers’ post on the Facebook code blog and read the paper: Rosetta: Large Scale System for Text Detection and Recognition in Images.

Why learn machine learning as a non-techie?
Is the machine learning process similar to how humans learn?
Facebook launches a 6-part Machine Learning video series
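As a rough illustration of the two-step detect-then-recognize pipeline described in the Rosetta piece above, the sketch below wires together off-the-shelf PyTorch components. Rosetta itself uses a ShuffleNet-based Faster R-CNN trained specifically on text regions plus a ResNet18-based sequence model built with Caffe2; the generic torchvision detector and classifier here only mirror the structure, so treat this as a conceptual outline rather than Facebook's implementation.

```python
# Conceptual two-stage OCR pipeline: (1) detect candidate regions,
# (2) run a CNN on each cropped region. The models are generic stand-ins.
import torch
import torchvision

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).eval()
recognizer = torchvision.models.resnet18(pretrained=True).eval()  # stand-in word recognizer

def extract_regions(image: torch.Tensor, score_threshold: float = 0.8):
    """Return crops whose detection score exceeds the threshold."""
    with torch.no_grad():
        output = detector([image])[0]          # dict with 'boxes', 'scores', 'labels'
    crops = []
    for box, score in zip(output["boxes"], output["scores"]):
        if score >= score_threshold:
            x1, y1, x2, y2 = box.int().tolist()
            crops.append(image[:, y1:y2, x1:x2])
    return crops

def recognize(crop: torch.Tensor) -> torch.Tensor:
    """Run the recognizer on one crop (a real system decodes a character sequence)."""
    resized = torch.nn.functional.interpolate(crop.unsqueeze(0), size=(224, 224))
    with torch.no_grad():
        return recognizer(resized)

image = torch.rand(3, 480, 640)                # placeholder image tensor
features = [recognize(c) for c in extract_regions(image)]
print(f"processed {len(features)} candidate regions")
```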


DARPA’s $2 Billion ‘AI Next’ campaign includes a Next-Generation Nonsurgical Neurotechnology (N3) program

Savia Lobo
11 Sep 2018
3 min read
Last Friday (September 7, 2018), DARPA announced a multi-year investment of more than $2 billion in a new program called the ‘AI Next’ campaign. DARPA’s agency director, Dr. Steven Walker, officially unveiled the large-scale effort during D60, DARPA’s 60th Anniversary Symposium held in Maryland. The campaign seeks contextual reasoning in AI systems in order to create deeper trust and collaborative partnerships between humans and machines.

The key areas the AI Next campaign may include are:
Automating critical DoD (Department of Defense) business processes, such as security clearance vetting in a week or accrediting software systems in one day for operational deployment.
Improving the robustness and reliability of AI systems, and enhancing the security and resiliency of machine learning and AI technologies.
Reducing power, data, and performance inefficiencies.
Pioneering the next generation of AI algorithms and applications, such as ‘explainability’ and common-sense reasoning.

The Next-Generation Nonsurgical Neurotechnology (N3) program

At the conference, DARPA officials also described the next frontier of neuroscience research: technologies for able-bodied soldiers that give them super abilities. Following this, they introduced the Next-Generation Nonsurgical Neurotechnology (N3) program, which was announced in March. The program aims to fund research on tech that can transmit high-fidelity signals between the brain and some external machine without requiring that the user be cut open for rewiring or implantation.

Al Emondi, manager of N3, told IEEE Spectrum that he is currently picking the researchers who will be funded under the program and that an announcement can be expected in early 2019. The program has two tracks:

Completely non-invasive: The N3 program aims for new non-invasive tech that can match the high performance currently achieved only with implanted electrodes, which are nestled in the brain tissue and therefore have a direct interface with neurons, either recording the electrical signals when the neurons “fire” into action or stimulating them to cause that firing.

Minutely invasive: DARPA says it doesn’t want its new brain tech to require even a tiny incision. Instead, minutely invasive tech might enter the body in the form of an injection, a pill, or even a nasal spray. Emondi imagines “nanotransducers” that can sit inside neurons, converting the electrical signal when a neuron fires into some other type of signal that can be picked up through the skull.

Justin Sanchez, director of DARPA’s Biological Technologies Office, said that making brain tech easy to use will open the floodgates. He added, “We can imagine a future of how this tech will be used. But this will let millions of people imagine their own futures.”

To learn more about the AI Next campaign and the N3 program in detail, visit the DARPA blog.

Skepticism welcomes Germany’s DARPA-like cybersecurity agency – The federal agency tasked with creating cutting-edge defense technology
DARPA on the hunt to catch deepfakes with its AI forensic tools underway


TensorFlow announces TensorFlow Data Validation (TFDV) to automate and scale data analysis, validation, and monitoring

Bhagyashree R
11 Sep 2018
2 min read
Today the TensorFlow team announced the launch of TensorFlow Data Validation (TFDV), an open-source library that enables developers to understand, validate, and monitor their machine learning data at scale.

Why was TensorFlow Data Validation introduced?

When building machine learning algorithms, a lot of attention is paid to improving their performance. However, if the input data is wrong, all of that optimization effort goes to waste. Understanding and validating a small amount of data is easy and can even be done manually, but in the real world this is rarely the case: data in production is huge and often arrives continuously and in big chunks. This is why it is necessary to automate and scale the tasks of data analysis, validation, and monitoring.

What are some features of TFDV?

TFDV is part of the TensorFlow Extended (TFX) platform, a TensorFlow-based general-purpose machine learning platform. It is already used by Google every day to analyze and validate petabytes of data. TFDV provides the following features:
It can compute descriptive statistics that provide a quick overview of the data in terms of the features that are present and the shapes of their value distributions. It includes tools such as Facets Overview, which visualizes the computed statistics for easy browsing.
A data schema can be generated automatically to describe expectations about the data, such as required values, ranges, and vocabularies. Since writing a schema can be tedious for datasets with many features, TFDV provides a method to generate an initial version of the schema based on the descriptive statistics. You can inspect the schema with the schema viewer.
You can identify anomalies such as missing features, out-of-range values, or wrong feature types with anomaly detection. An anomalies viewer shows which features have anomalies so you can learn more and correct them.

To learn more about how it is used in production, read the official announcement by TensorFlow on Medium and check out TFDV’s GitHub repository.

Why TensorFlow always tops machine learning and artificial intelligence tool surveys
TensorFlow 2.0 is coming. Here’s what we can expect.
Can a production ready Pytorch 1.0 give TensorFlow a tough time?
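The statistics, schema, and anomalies workflow described in the TFDV article above maps onto a handful of library calls. The sketch below shows a typical flow, assuming CSV files at the hypothetical paths train.csv and eval.csv; consult the TFDV documentation for the exact options available in your version.

```python
# Minimal TFDV sketch: compute statistics, infer a schema, then validate new
# data against that schema. The file paths are placeholders.
import tensorflow_data_validation as tfdv

# 1. Descriptive statistics over the training data.
train_stats = tfdv.generate_statistics_from_csv(data_location="train.csv")
tfdv.visualize_statistics(train_stats)        # Facets Overview in a notebook

# 2. Infer an initial schema (required features, types, value domains).
schema = tfdv.infer_schema(statistics=train_stats)
tfdv.display_schema(schema)

# 3. Validate a new slice of data (e.g. eval or serving data) against the schema.
eval_stats = tfdv.generate_statistics_from_csv(data_location="eval.csv")
anomalies = tfdv.validate_statistics(statistics=eval_stats, schema=schema)
tfdv.display_anomalies(anomalies)             # missing features, out-of-range values, etc.
```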


Introducing Jupytext: Jupyter notebooks as Markdown documents, Julia, Python or R scripts

Natasha Mathur
11 Sep 2018
2 min read
Project Jupyter released Jupytext last week, a new project that allows you to convert Jupyter notebooks to and from Julia, Python or R scripts (extensions .jl, .py and .R), Markdown documents (extension .md), or R Markdown documents (extension .Rmd). It comes with features such as writing notebooks as plain text, paired notebooks, command-line conversion, and round-trip conversion, and it is available from within Jupyter. It allows you to work as you usually would on your notebook in Jupyter, and to save and read it in the formats you select. Let’s have a look at its major features.

Writing notebooks as plain text

Jupytext allows plain scripts that you can draft and test in your favorite IDE and open naturally as notebooks in Jupyter. You can run the notebook in Jupyter to generate output, associate an .ipynb representation, and save and share your research.

Paired notebooks

Paired notebooks let you store an .ipynb file alongside the text-only version. They can be enabled by adding a jupytext_formats entry to the notebook metadata with Edit/Edit Notebook Metadata in Jupyter’s menu. On saving the notebook, both the Jupyter notebook and the Python script are updated.

Command line conversion

There is a jupytext script for command-line conversion between the various notebook extensions:

jupytext notebook.ipynb --to md --test          (test round-trip conversion)
jupytext notebook.ipynb --to md --output        (display the Markdown version on screen)
jupytext notebook.ipynb --to markdown           (create a notebook.md file)
jupytext notebook.ipynb --to python             (create a notebook.py file)
jupytext notebook.md --to notebook              (overwrite notebook.ipynb, removing outputs)

Round-trip conversion

Round-trip conversion is also possible with Jupytext. Converting a script to a Jupyter notebook and back to a script is the identity: if you associate a Jupyter kernel with your notebook, that information goes into a YAML header at the top of your script. Converting Markdown to a Jupyter notebook and back to Markdown is also the identity. Converting Jupyter to script and back to Jupyter preserves source and metadata, and converting Jupyter to Markdown and back to Jupyter likewise preserves source and metadata (cell metadata is available only for R Markdown).

For more information, check out the official release notes.

10 reasons why data scientists love Jupyter notebooks
Is JupyterLab all set to phase out Jupyter Notebooks?
How everyone at Netflix uses Jupyter notebooks from data scientists, machine learning engineers, to data analysts


Introducing Watermelon DB: A new relational database to make your React and React Native apps highly scalable

Bhagyashree R
11 Sep 2018
2 min read
Now you can store your data in Watermelon! Yesterday, Nozbe released Watermelon DB v0.6.1-1, a new addition to the database world. It aims to help you build powerful React and React Native apps that scale to a large number of records and remain fast.

Watermelon’s architecture is database-agnostic, making it cross-platform. It is a high-level layer for dealing with data that can be plugged into any underlying database, depending on platform needs.

Why choose Watermelon DB?

Watermelon DB is optimized for building complex React and React Native applications. The following factors help ensure high application speed:
It makes your application highly scalable by using lazy loading, which means Watermelon DB loads data only when it is requested.
Most queries resolve in less than 1 ms, even with 10,000 records, as all querying is done on an SQLite database on a separate thread.
You can launch your app instantly irrespective of how much data you have.
It is supported on iOS, Android, and the web.
It is statically typed, with Flow, a static type checker for JavaScript, in mind.
It is fast, asynchronous, multi-threaded, and highly cached.
It is designed to be used with a synchronization engine to keep the local database up to date with a remote database.

Currently, Watermelon DB is in active development and cannot be used in production. The roadmap states that migrations will soon be added to allow production use of Watermelon DB. Schema migrations are the mechanism by which you can add new tables and columns to the database in a backward-compatible way.

To learn how to install it and try a few examples, check out Watermelon DB on GitHub.

React Native 0.57 coming soon with new iOS WebViews
What’s in the upcoming SQLite 3.25.0 release: window functions, better query optimizer and more
React 16.5.0 is now out with a new package for scheduling, support for DevTools, and more!

Rigetti Computing launches the first Quantum Cloud Services to bring quantum computing to businesses

Sugandha Lahoti
11 Sep 2018
3 min read
Rigetti Computing has launched Quantum Cloud Services, bringing together the best of classical and quantum computing on a single cloud platform. “What this platform achieves for the very first time is an integrated computing system that is the first quantum cloud services architecture,” says Chad Rigetti, founder and CEO.

Rigetti Computing has been competing head to head with behemoths like Google and IBM to grab the quantum computing market. Last month, Rigetti unveiled plans to deploy a 128-qubit quantum computing system, challenging Google, IBM, and Intel for leadership in this emerging technology. Prior to that, last December, Rigetti developed a new quantum algorithm to supercharge unsupervised machine learning. Now the startup says quantum computing is almost ready for business.

With QCS you can build and run programs combining real quantum hardware with a virtual development environment. Quantum Cloud Services will be used to address fundamental challenges in medicine, energy, business, and science, and it offers a combination of a cloud-based classical computer, Rigetti’s Forest development platform, and access to Rigetti’s quantum backends.

Chemistry: QCS can be used to predict the properties of complex molecules and materials in order to design more effective medicines, energy technologies, and resilient crops.
Machine learning: QCS can be used to train advanced AI on quantum computers, helping with computer vision, pattern recognition, voice recognition, and machine translation.
Optimization: QCS can solve complex optimization problems such as ‘job shop’ scheduling and traveling salesperson problems, driving critical efficiencies in business, military and public sector logistics, scheduling, shipping, and resource allocation.

Rigetti is now inviting customers to apply for free access to these systems. The company has also invited developers to build a real-world application that achieves quantum advantage; the first to do so wins a $1 million prize. “What we want to do is focus on the commercial utility and applicability of these machines, because ultimately that’s why this company exists,” says Rigetti.

Rigetti is also partnering with a number of leading quantum computing startups, including Entropica Labs, Horizon Quantum Computing, OTI Lumionics, ProteinQure, QC Ware and Riverlane Research, which have collaborated with Rigetti to build and distribute applications through the Rigetti QCS platform.

You can read more details on the Rigetti Computing official website.

Quantum computing is poised to take a quantum leap with industries and governments on its side
Did quantum computing just take a quantum leap? A two-qubit chip by UK researchers makes controlled quantum entanglements possible
Rigetti plans to deploy 128 qubit chip Quantum computer
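Returning to the Forest platform mentioned in the QCS article above: programs for Rigetti's backends are typically written with pyQuil, its Python library. The sketch below prepares and samples a two-qubit Bell state against the local quantum virtual machine; it is a generic pyQuil example rather than code from the QCS announcement, and the exact setup depends on your Forest/QCS installation.

```python
# A minimal pyQuil sketch: build a Bell state and sample it on the local QVM.
# Assumes the Forest SDK (qvm and quilc servers) is running locally; the
# backend name "2q-qvm" is the generic two-qubit simulator.
from pyquil import Program, get_qc
from pyquil.gates import H, CNOT

program = Program(
    H(0),        # put qubit 0 into superposition
    CNOT(0, 1),  # entangle qubit 1 with qubit 0
)

qc = get_qc("2q-qvm")                       # swap in a QPU lattice name on real hardware
results = qc.run_and_measure(program, trials=100)
print(results[0][:10], results[1][:10])     # measurements of qubits 0 and 1 should agree
```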


Like newspapers, Google’s algorithms are protected by the First Amendment, making them hard to regulate legally

Savia Lobo
10 Sep 2018
4 min read
At the end of last month, Google denied U.S. President Donald Trump’s accusatory tweet claiming that its algorithms favor liberal media outlets over right-wing ones. Trump’s accusations hinted at Google regulating the information that comes up in Google searches. However, governing or regulating algorithms, and the decisions they make about which information should be provided and prioritized, is tricky.

Eugene Volokh, a University of California, Los Angeles law professor and author of a 2012 white paper on the constitutional First Amendment protection of search engines, said, “Each search engine’s editorial judgment is much like many other familiar editorial judgments.” A newspaper case from 1974 sheds light on what the government can control under the First Amendment when it comes to companies’ algorithms and how they produce and organize information. On similar lines, Google has the right to protect its algorithms from being regulated by law.

Google has the right to protect its algorithms, based on a 1974 case

In the 1974 case Miami Herald v. Tornillo, the Supreme Court struck down a Florida law that gave political candidates the “right of reply” to criticisms they faced in newspapers. The law required the newspaper to publish a response from the candidate and to place it, free of charge, in a conspicuous place. The candidate’s lawyers contended that newspapers held near-monopolistic roles when it came to reaching audiences and that compelling them to publish responses was the only way to ensure that candidates could have a comparable voice. The 1974 case appears similar to the current scenario: if Google’s algorithms are manipulated, those who are harmed have comparatively limited tools through which to be heard.

Back then, the Herald refused to comply with the law. Its editors argued that the law violated the First Amendment because it allowed the government to compel a newspaper to publish certain information. The Supreme Court agreed with the Herald, and the Justices explained that the government cannot force newspaper editors “to publish that which reason tells them should not be published.”

Why Google cannot be regulated by law

Similar to the 1974 case, Justices have used that decision to highlight that the government cannot compel expression. They also emphasized that the information selected by editors for their audiences is part of a process and that the government has no role in that process. The court wrote, “The choice of material to go into a newspaper and the decisions as to limitations on size and content of the paper, and treatment of public issues and public officials—fair or unfair—constitute the exercise of editorial control and judgment.”

According to two federal court decisions, Google is not a newspaper and algorithms are not human editors; thus, a search engine or social media company’s algorithm-based content decisions should not be protected in the same way as those made by newspaper editors. One judge explained, “Here, the process, which involves the . . . algorithm, is objective in nature. In contrast, the result, which is the PageRank—or the numerical representation of relative significance of a particular website—is fundamentally subjective in nature.” Ultimately, the judge compared Google’s algorithms to the types of judgments that credit-rating companies make. These firms have a right to develop their own processes and to communicate the outcomes.

A comparison of journalistic protections and algorithms also arose in the Supreme Court’s ruling in Citizens United v. FEC in 2010. The case focused on the parts of the Bipartisan Campaign Reform Act that limited certain types of corporate donations during elections. Citizens United, which challenged the law, is a political action committee. Chief Justice John Roberts explained that the law, because of its limits on corporate spending, could allow the government to halt newspapers from publishing certain information simply because they are owned by corporations, which could also harm public discourse.

Any attempt to regulate Google’s and other corporations’ algorithmic outputs would have to overcome:
The hurdles the Supreme Court put in place in the Herald case regarding compelled speech and editorial decision-making.
The Citizens United precedent that corporate speech, which would also include a company’s algorithms, is protected by the First Amendment.

Read more about this news in detail at the Columbia Journalism Review.

Google slams Trump’s accusations, asserts its search engine algorithms do not favor any political ideology
North Korean hacker charged for WannaCry ransomware and for infiltrating Sony Pictures Entertainment
California’s tough net neutrality bill passes state assembly vote


A new Video-to-Video Synthesis model uses Artificial Intelligence to create photorealistic videos

Natasha Mathur
10 Sep 2018
4 min read
A paper titled “Video-to-Video Synthesis” introduces a new model that uses the generative adversarial learning framework. The model performs video-to-video synthesis and achieves high-resolution, photorealistic, and temporally coherent results on a diverse set of inputs, including segmentation masks, sketches, and poses.

https://www.youtube.com/watch?v=GRQuRcpf5Gc

What problem is the paper trying to solve?

The paper focuses on a mapping function that can effectively convert an input video to an output video. Although image-to-image translation methods are quite popular, a general-purpose solution for video-to-video synthesis had not yet been explored. The paper treats video-to-video synthesis as a distribution matching problem: the model is trained so that the conditional distribution of the synthesized videos, given the input videos, resembles that of real videos. Given a set of aligned input and output videos, the model maps input videos to the output domain at test time. The approach can generate photorealistic 2K-resolution videos up to 30 seconds long.

How does the model work?

The network is trained in a spatio-temporally progressive manner. “We start with generating low-resolution and few frames, and all the way up to generating full resolution and 30 (or more) frames. Our coarse-to-fine generator consists of three scales, which operates on 512 × 256, 1024 × 512, and 2048 × 1024 resolutions, respectively,” reads the paper.

The model is trained for 40 epochs using the ADAM optimizer with lr = 0.0002 and (β1, β2) = (0.5, 0.999) on an NVIDIA DGX-1 machine. All the GPUs in the DGX-1 (8 V100 GPUs, each with 16 GB of memory) are used for training: generator computation is distributed across 4 GPUs and discriminator computation across the other 4. Training the model takes around 10 days for 2K resolution. Several datasets are used for training, including Cityscapes, ApolloScape, a face video dataset, the FaceForensics dataset, and a dance video dataset. The researchers also compared the approach to two baselines trained on the same data: pix2pixHD (the state-of-the-art image-to-image translation approach) and COVST.

For evaluating the model’s performance, both subjective and objective metrics are used. The first is a human preference score, a human subjective test for evaluating the visual quality of synthesized videos. The second is the Fréchet Inception Distance (FID), a widely used metric for implicit generative models.

Limitations of the model

The model fails when synthesizing turning cars because of insufficient information in the label maps; this could be addressed by adding 3D cues such as depth maps. The model also doesn’t guarantee that an object will have a consistent appearance across the whole video, so a car may gradually change its color. Lastly, when performing semantic manipulations such as turning trees into buildings, visible artifacts may appear because buildings and trees have different label shapes. This could be mitigated by training the model on coarser semantic labels, making it less sensitive to label shapes.

“Extensive experiments demonstrate that our results are significantly better than the results by state-of-the-art methods. Its extension to the future video prediction task also compares favorably against the competing approaches,” reads the paper.

The paper has received public criticism from some over the concern that it could be used to create deepfakes or tampered videos that deceive people for illegal or exploitative purposes, while others view it as a great step into an AI-driven future.

For more information, be sure to check out the official research paper.

This self-driving car can drive in its imagination using deep reinforcement learning
Introducing Deon, a tool for data scientists to add an ethics checklist
Baidu releases EZDL – a platform that lets you build AI and machine learning models without any coding knowledge
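The training recipe quoted in the article above (ADAM with lr = 0.0002 and betas (0.5, 0.999), with separate generator and discriminator updates) is a standard GAN setup and is easy to reproduce in outline. The PyTorch sketch below shows only that optimizer wiring and a single adversarial update step with trivial placeholder networks; it is not the vid2vid code, which adds the coarse-to-fine generator, temporal discriminators, and multi-GPU distribution described in the paper.

```python
# Skeleton of an adversarial training step with the optimizer settings quoted
# from the paper (Adam, lr=0.0002, betas=(0.5, 0.999)). The networks here are
# trivial placeholders, not the vid2vid architecture.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3 * 32 * 32))
discriminator = nn.Sequential(nn.Linear(3 * 32 * 32, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCEWithLogitsLoss()

real = torch.rand(16, 3 * 32 * 32)            # stand-in for real frames
noise = torch.randn(16, 64)                   # stand-in for conditioning input

# Discriminator step: push real frames toward 1, generated frames toward 0.
fake = generator(noise).detach()
d_loss = bce(discriminator(real), torch.ones(16, 1)) + \
         bce(discriminator(fake), torch.zeros(16, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator output 1 on generated frames.
g_loss = bce(discriminator(generator(noise)), torch.ones(16, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(f"d_loss={d_loss.item():.3f}  g_loss={g_loss.item():.3f}")
```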


OpenZeppelin 2.0 RC 1, a framework for writing secure smart contracts on Ethereum, is out!

Bhagyashree R
10 Sep 2018
3 min read
After concluding the release cycle of version 1.0 last month, OpenZeppelin marked the start of another release cycle by launching OpenZeppelin 2.0 RC 1 on September 7th. This release aims to deliver reliable updates to users, unlike some earlier releases, which were backwards-incompatible. OpenZeppelin is a framework of reusable smart contracts for Ethereum and other EVM and eWASM blockchains, which you can use to build distributed applications, protocols, and organizations in the Solidity language.

What's new in OpenZeppelin 2.0 RC 1?

Changes

The release provides a more granular system of roles, like the MinterRole. As with Ownable, the creator of a contract is assigned all roles at first, but can selectively give them out to other accounts. Ownable contracts have now moved to role-based access.
To increase encapsulation, all state variables are now private. Derived contracts cannot directly access the state variables and have to use getters.
All event names have been changed to be consistently in the past tense, except those defined by an ERC.
ERC721 is now separated into different optional interfaces, Enumerable and Metadata; ERC721Full has both extensions.
In SafeMath, require is used instead of assert.
The ERC721.exists function is now internal.
Earlier, SplitPayment allowed deployment of an instance with no payees, which would cause every single call to claim to revert, making all Ether sent to it lost forever. The preconditions on the SplitPayment constructor arguments have been changed to prevent this scenario.
The IndividuallyCappedCrowdsale interface has been simplified by removing the concept of a user from the crowdsale flavor. The setGroupCap function, which takes an array, has also been removed, as this is not customary across the OpenZeppelin API.
ERC contracts have all been renamed to follow the same convention: interfaces are called IERC## and their implementations are ERC##.
ERC20.decreaseApproval has been renamed to decreaseAllowance, and its semantics have changed to be more secure.
MerkleProof.verifyProof has been renamed to MerkleProof.verify.
ECRecovery has been renamed to ECDSA, and AddressUtils to Address.

Additions

ERC165Query has been added to query support for ERC165 interfaces.
A new experimental contract has been added to migrate ERC20 tokens with an opt-in strategy.
A modulo operation, SafeMath.mod, has been added to get the remainder of a division.
Math.average has been added.
ERC721Pausable has been added.

Removals

The restriction on who can release funds in PullPayments, SplitPayment, PostDeliveryCrowdsale, and RefundableCrowdsale has been removed.
ERC20Basic has been removed; now there is only ERC20.
The Math.min64 and Math.max64 functions have been removed, leaving only the uint256 variants.
The Mint and Burn events have been removed from ERC20Mintable and ERC20Burnable.
A few contracts that were not generally secure enough have been removed: LimitBalance, HasNoEther, HasNoTokens, HasNoContracts, NoOwner, Destructible, TokenDestructible, CanReclaimToken.

You can install the release candidate by running the npm install openzeppelin-solidity@next command. To read more about OpenZeppelin 2.0 RC 1, head over to OpenZeppelin’s GitHub repository.

The trouble with Smart Contracts
Ethereum Blockchain dataset now available in BigQuery for smart contract analytics
How to set up an Ethereum development environment [Tutorial]

The Intercept says IBM developed NYPD surveillance tools that let cops pick targets based on skin color

Fatema Patrawala
07 Sep 2018
3 min read
The NYPD's secretive Lower Manhattan Security Coordination Center uses software from IBM in its video analytics system, according to a report by The Intercept. The technology developed by IBM allows cops to automatically scan surveillance footage for machine-generated labels that identify clothing and other identifying classifiers. Recent confidential corporate documents from IBM provide real-time insight into how this system was developed and used.

Since at least 2012, and until at least 2016, IBM's video classification tool allowed NYPD officers and contractors to use skin color as a classifier for identifying suspects; the training data for this feature came from the NYPD's own footage. The Intercept and the Investigative Fund have learned that IBM began developing this object identification technology using secret access to NYPD camera footage. With access to images of thousands of unknowing New Yorkers offered up by NYPD officials, IBM was creating new search features that allow other police departments to search camera footage for images of people by hair color, facial hair, and skin tone. More recent versions of IBM's tools have "ethnicity" search boxes that allow users to search on terms like "white," "black," and "Asian."

In an email to The Intercept, the NYPD confirmed that select counterterrorism officials had access to a pre-released version of IBM’s program, which included skin tone search capabilities, as early as the summer of 2012. NYPD spokesperson Peter Donald said the search characteristics were only used for evaluation purposes and that officers were instructed not to include the skin tone search feature in their assessments. The department eventually decided not to integrate the analytics program into its larger surveillance architecture and phased out the IBM program in 2016.

The NYPD has been notorious for decades' worth of mass-scale racial profiling scandals, ranging from stop-and-frisk to public executions of black people. The department maintains that IBM personnel had access to certain cameras for the sole purpose of configuring the NYPD's system, and that it had safeguards in place to protect the data, including non-disclosure agreements for each individual accessing the system and for the companies and vendors it worked with.

Civil liberties advocates contend that New Yorkers should have been made aware of the potential use of their physical data for a private company's development of surveillance technology. They want the NYPD to be transparent about surveillance acquisitions and to adhere to the New York City surveillance bill.

Say hello to IBM RXN, a free AI Tool in IBM Cloud for predicting chemical reactions
Stack skills, not degrees: Industry-leading companies, Google, IBM, Apple no longer require degrees
IBM Files Patent for “Managing a Database Management System using a Blockchain Database”


Google launches a Dataset Search Engine for finding Datasets on the Internet

Sugandha Lahoti
07 Sep 2018
2 min read
Google has launched Dataset Search, a search engine for finding datasets on the internet. The search engine will be a companion of sorts to Google Scholar, the company’s popular search engine for academic studies and reports. Google Dataset Search allows users to search through datasets across thousands of repositories on the web, whether they live on a publisher's site, in a digital library, or on an author's personal web page.

Google’s Dataset Search scrapes government databases, public sources, digital libraries, and personal websites to track down datasets. It also supports multiple languages and will add support for more soon. The initial release covers the environmental and social sciences, government data, and datasets from news organizations like ProPublica, and it may soon expand to include more sources.

Google has developed guidelines for dataset providers to describe their data in a way that Google can better understand. Anybody who publishes data structured using schema.org markup, or similar equivalents described by the W3C, will be picked up by the search engine. Google also noted that Dataset Search will improve as long as data publishers are willing to provide good metadata: if publishers use the open standards to describe their data, more users will find the data they are looking for.

Natasha Noy, a research scientist at Google AI who helped create Dataset Search, says that “the aim is to unify the tens of thousands of different repositories for datasets online. We want to make that data discoverable, but keep it where it is.”

Ed Kearns, Chief Data Officer at NOAA, is a strong supporter of the project and helped NOAA make many of its datasets searchable in this tool. “This type of search has long been the dream for many researchers in the open data and science communities,” he said.

Try out Google’s new Dataset Search here.

25 Datasets for Deep Learning in IoT
Datasets and deep learning methodologies to extend image-based applications to videos
Google-Landmarks, a novel dataset for instance-level image recognition


cstar: Spotify’s Cassandra orchestration tool is now open source!

Melisha Dsouza
07 Sep 2018
4 min read
On September 4, 2018, Spotify Labs announced that cstar, its Cassandra orchestration tool for the command line, would be made freely available to the public.

In Cassandra, achieving the right performance, security, and data consistency can be complicated: you need to run a specific set of shell commands on every node of a cluster, usually in some coordination to avoid taking the cluster down. This is easy for small clusters but gets tricky and time-consuming for big ones. Imagine having to run those commands on all Cassandra nodes in the company; it would be time-consuming and labor-intensive.

A scheduled upgrade of the entire Cassandra fleet at Spotify involved a precise procedure with numerous steps. Since Spotify has clusters with hundreds of nodes, upgrading one node at a time is unrealistic. Upgrading all nodes at once wasn't an option either, since that would take down the whole cluster. In addition to the outlined performance problems, other complications when dealing with Cassandra include:

Temporary network failures, breaking SSH connections, among others.
Performance and availability can be affected if operations that are computation-heavy or involve restarting the Cassandra process or node are not executed in a particular order.
Nodes can go down at any time, so the status of the cluster should be checked not just before running the task, but also before execution starts on a new node, which rules out blind parallelization.

Spotify was in dire need of an efficient and robust way to run these operations on thousands of machines in a coordinated manner.

Why weren't Ansible or Fabric considered by Spotify?

Ansible and Fabric are not topology-aware. They can be made to run commands in parallel on groups of machines, and with some wrapper scripts and elbow grease a Cassandra cluster can be split into multiple groups, executing a script on all machines in one group in parallel. On the downside, this approach neither waits for Cassandra nodes to come back up before proceeding nor notices if random Cassandra nodes go down during execution.

Enter cstar

cstar is based on paramiko, a Python (2.7, 3.4+) implementation of the SSHv2 protocol, and shares the same ssh/scp implementation that Fabric uses. It is a command-line tool that runs an arbitrary script on all hosts in a Cassandra cluster in a "topology aware" fashion.

[Figure: cstar running on a 9-node cluster with a replication factor of 3, assuming the script brings down the Cassandra process. Notice how there are always 2 available replicas for each token range. Source: Spotify Labs]

cstar supports the following execution mechanisms:
The script is run on exactly one node per data center at a time. If you have N data centers with M nodes each and a replication factor of X, this effectively runs the script on M/X * N nodes at a time.
The script is run on all nodes at the same time, regardless of the topology.

[Figure: Installing cstar and running a command on a cluster. Source: Spotify Labs]

The concept of 'jobs'

The execution of a script on one or more clusters is a job. Job control in cstar works like in Unix shells: a user can pause running jobs and resume them at a later point in time. It is also possible to configure cstar to pause a job after a certain number of nodes have completed. This lets users run a cstar job on one node, manually validate that the job worked as expected, and then resume the job.

These features have made it much easier for Spotify to work with Cassandra clusters. You can find more insights in the article on Spotify Labs.

Mozilla releases Firefox 62.0 with better scrolling on Android, a dark theme on macOS, and more
Baidu releases EZDL – a platform that lets you build AI and machine learning models without any coding knowledge
PrimeTek releases PrimeReact 2.0.0 Beta 3 version
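"Topology aware" in the cstar piece above means never taking down two replicas of the same token range at once. The toy sketch below shows one way to derive concurrency groups for a single data center, assuming a simple ring where each node replicates to the next RF-1 nodes and the node count is a multiple of RF; it illustrates the idea only and is not cstar's actual scheduler.

```python
# Toy illustration of topology-aware scheduling: group nodes so that no two
# nodes in the same group share a token range (simple ring, replication
# factor rf, single data center, node count divisible by rf). This mirrors
# the idea behind cstar, not its real implementation.
def concurrency_groups(nodes: list, rf: int) -> list:
    """Split nodes into groups that can safely run a disruptive script together."""
    groups = [[] for _ in range(rf)]
    for i, node in enumerate(nodes):
        # On the ring, nodes closer than rf positions apart hold replicas of the
        # same ranges, so they must land in different groups.
        groups[i % rf].append(node)
    return groups

nodes = [f"cassandra-{i}" for i in range(9)]   # 9-node cluster, like the example above
for step, group in enumerate(concurrency_groups(nodes, rf=3), start=1):
    print(f"step {step}: run on {group}")      # 3 nodes at a time, 2 replicas always up
```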

ONNX 1.3 is here with experimental function concept

Natasha Mathur
06 Sep 2018
2 min read
The Open Neural Network Exchange (ONNX) team released ONNX 1.3 last week. The latest release includes features such as an experimental function concept, along with other related improvements. ONNX is an open ecosystem that allows artificial intelligence developers to select the right set of tools as their project evolves. It provides an open-source format for deep learning models, so that a model trained in one framework can easily be transferred to another. Let’s explore the changes in ONNX 1.3.

ONNX 1.3 key updates

The control flow operators in operator set 8 have graduated from the experimental phase.
A new operator, Expand, has been added, and the Max, Min, Mean, and Sum operators now support broadcasting. There is also added support for output indices in the MaxPool operator.
An experimental function concept has been introduced for representing composed operators; MeanVarianceNormalization uses this feature.
Shape inference has been enhanced with support for the Reshape operator with a constant new shape.
More ONNX optimization passes are available.
More operator backend tests are available, along with a newly added test coverage stats page.
The opset version converter supports operators such as Add, Mul, Gemm, Relu, BatchNorm, Concat, Reshape, Sum, MaxPool, AveragePool, and Dropout. All the models in the model zoo are covered, except tiny-yolo-v2.

For more information, check out the official ONNX 1.3 release notes.

Amazon, Facebook and Microsoft announce the general availability of ONNX v0.1
ONNX for MXNet: Interoperability across deep learning models made easy
Baidu announces ClariNet, a neural network for text-to-speech synthesis
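The shape-inference and opset version converter improvements mentioned in the ONNX update above correspond to utilities in the onnx Python package. The sketch below loads a model, checks it, runs shape inference, and converts it to a target opset; the file name is a placeholder, and the target opset you pick depends on the runtimes you need to support.

```python
# Minimal sketch of the ONNX Python utilities touched by this release:
# model checking, shape inference, and the opset version converter.
# "model.onnx" is a placeholder path.
import onnx
from onnx import shape_inference, version_converter

model = onnx.load("model.onnx")
onnx.checker.check_model(model)                     # validate the graph structure

inferred = shape_inference.infer_shapes(model)      # annotate intermediate tensor shapes
print(inferred.graph.value_info[:3])                # peek at a few inferred shapes

converted = version_converter.convert_version(model, target_version=8)
onnx.save(converted, "model_opset8.onnx")
print("converted to opset", converted.opset_import[0].version)
```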


This self-driving car can drive in its imagination using deep reinforcement learning

Natasha Mathur
06 Sep 2018
3 min read
Wayve, a new U.K. self-driving car startup, has trained a car to drive "in its imagination" using a model-based deep reinforcement learning system. The system lets the prediction model learn from real-world data collected offline. The car observes the motion of other agents in the scene and predicts their direction, thereby making informed driving decisions.

[Video: Dreaming to drive]

The deep reinforcement learning system was trained using data collected in sunny weather in Cambridge, UK. The training process used World Models (Ha & Schmidhuber, 2018) with monocular camera input on an autonomous vehicle. Although the system was trained for sunny weather, it can still drive successfully in the rain: it does not get distracted by reflections from puddles or by water droplets on the camera lens.

[Video: Dreaming to drive in the rain]

The underlying training process

First, the prediction model is trained on the collected data. A variational autoencoder is used to encode the camera images into a low-dimensional state, and a probabilistic recurrent neural network is then trained as a prediction model, estimating the next probabilistic state from the current state and action. Once the encoder and prediction model have been trained on real-world data, a driving policy is initialized and its performance is assessed using the prediction model in simulated experiences. Many such simulated sequences can be generated by imagining experiences, and these imagined sequences can also be visualized to inspect the learned policy.

“Using a prediction model, we can dream to drive on a massively parallel server, independent of the robotic vehicle. Furthermore, traditional simulation approaches require people to hand-engineer individual situations to cover a wide variety of driving scenarios. Learning a prediction model from data automates the process of scenario generation, taking the human engineer out of the loop,” reads the Wayve blog post.

Generally, there are differences in appearance and behavior between simulator solutions and the real world, making it challenging to leverage knowledge acquired in simulation. Wayve’s deep reinforcement learning system does not have this limitation, as it is trained directly on real-world data, so there is no major gap between the simulation and the real world. Finally, as the learned simulator is differentiable, a driving policy can be optimized directly using gradient descent.

“Wayve is committed to developing richer and more robust temporal prediction models and believes this is key to building intelligent and safe autonomous vehicles,” says the Wayve team.

For more information, check out the official Wayve blog post.

What we learned from CES 2018: Self-driving cars and AI chips are the rage!
Tesla is building its own AI hardware for self-driving cars
MIT’s Duckietown Kickstarter project aims to make learning how to program self-driving cars affordable
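The pipeline described in the Wayve article above, a VAE that compresses camera frames into a latent state plus a recurrent model that predicts the next state from the current state and action, is the core of the World Models recipe. The PyTorch sketch below shows just those two pieces at toy scale; it is not Wayve's code, and the dimensions and architectures are placeholders.

```python
# Toy World-Models-style components: a convolutional VAE encoder that maps a
# camera frame to a latent state, and a recurrent model that predicts the next
# latent state from (state, action). Dimensions are arbitrary placeholders.
import torch
import torch.nn as nn

LATENT, ACTION = 32, 2   # e.g. steering + throttle

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.mu = nn.LazyLinear(LATENT)       # mean of the latent distribution
        self.logvar = nn.LazyLinear(LATENT)   # log-variance of the latent distribution

    def forward(self, frame):
        h = self.conv(frame)
        mu, logvar = self.mu(h), self.logvar(h)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterized sample

class PredictionModel(nn.Module):
    """Predicts the next latent state given the current state and action."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRUCell(LATENT + ACTION, 256)
        self.head = nn.Linear(256, LATENT)

    def forward(self, state, action, hidden):
        hidden = self.rnn(torch.cat([state, action], dim=-1), hidden)
        return self.head(hidden), hidden

encoder, world_model = Encoder(), PredictionModel()
frame = torch.rand(1, 3, 64, 64)              # placeholder camera frame
state = encoder(frame)
hidden = torch.zeros(1, 256)

# "Dreaming": roll the prediction model forward without touching the real car.
for _ in range(5):
    action = torch.zeros(1, ACTION)           # a trained policy would choose this
    state, hidden = world_model(state, action, hidden)
print("imagined latent state shape:", state.shape)
```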