
Tech News - Data

1208 Articles

Google shares initiatives towards enforcing its AI principles; employs a formal review structure for new projects

Bhagyashree R
20 Dec 2018
3 min read
Earlier this year, Sundar Pichai shared seven AI principles that Google aims to follow in its work, along with some best practices for building responsible AI. Yesterday, Google shared the additional initiatives and processes it has introduced to live up to those principles. These include educating people about ethics in technology and introducing a formal review structure for new projects, products, and deals.

Educating Googlers on ethical AI

Making Googlers aware of ethical issues: Additional learning material has been added to the Ethics in Technology Practice course, which teaches technical and non-technical Googlers how to address the ethical issues that arise at work. In the future, Google plans to make this course accessible to everyone across the company.

Introducing an AI Ethics Speaker Series: This series features external experts across different countries, regions, and professional disciplines. So far, eight sessions have been conducted with 11 speakers, covering topics from bias in natural language processing (NLP) to the use of AI in criminal justice.

AI fairness: A new module on fairness has been added to Google’s free Machine Learning Crash Course. The course is available in 11 languages and is being used by more than 21,000 Google employees. The fairness module explores the different types of human bias that can creep into training data and provides strategies to identify and evaluate their effects.

Review structure for new projects, products, and deals

Google has introduced a formal review structure to assess the scale, severity, and likelihood of best- and worst-case scenarios for new projects, products, and deals. The review structure consists of three core groups:

Innovation team: This team consists of user researchers, social scientists, ethicists, human rights specialists, policy and privacy advisors, and legal experts. They are responsible for day-to-day operations and initial assessments.

Senior experts: This group consists of senior experts from a range of disciplines across Alphabet Inc. They provide technological, functional, and application expertise.

Council of senior executives: This group handles decisions that affect multiple products and technologies.

So far, more than 100 reviews have been completed under this formal review structure. In the future, Google plans to create an external advisory group comprising experts from a variety of disciplines.

To read more about Google’s initiatives towards ethical AI, check out their official announcement.

Google won’t sell its facial recognition technology until questions around tech and policy are sorted
Google expands its machine learning hardware portfolio with Cloud TPU Pods (alpha) to effectively train and deploy TensorFlow machine learning models on GCP
Google kills another product: Fusion tables


Stanford researchers introduce DeepSolar, a deep learning framework that mapped every solar panel in the US

Bhagyashree R
20 Dec 2018
3 min read
Yesterday, researchers from Stanford University introduced DeepSolar, a deep learning framework that analyzes satellite images to identify the GPS location and size of solar panels. Using this framework they have built a comprehensive database containing the GPS locations and sizes of solar installations in the US. The system identified 1.47 million individual solar installations across the United States, ranging from small rooftop configurations to solar farms and utility-scale systems. The DeepSolar database is publicly available to help researchers extract further insights into solar adoption. It will also help policymakers better understand the correlation between solar deployment and socioeconomic factors such as household income, population density, and education level.

How does DeepSolar work?

DeepSolar uses transfer learning to train a CNN classifier on 366,467 images. These images are sampled from over 50 cities and towns across the US, with only image-level labels indicating the presence or absence of panels. One of the researchers, Rajagopal, explained the model to Gizmodo: “The algorithm breaks satellite images into tiles. Each tile is processed by a deep neural net to produce a classification for each pixel in a tile. These classifications are combined together to detect if a system—or part of—is present in the tile.”

The deep neural net then identifies which tiles contain a solar panel. Once training is complete, the network produces an activation map, also known as a heat map. The heat map outlines the panels, which can be used to obtain the size of each solar panel system. Rajagopal further explained why this approach is efficient: “A rooftop PV system typically corresponds to multiple pixels. Thus even if each pixel classification is not perfect, when combined you get a dramatically improved classification. We give higher weights to false negatives to prevent them.”

What are some of the observations the researchers made?

To measure classification performance, the researchers used two metrics: precision, the rate of correct decisions among all positive decisions, and recall, the rate of correct decisions among all positive samples. DeepSolar achieved a precision of 93.1% with a recall of 88.5% in residential areas, and a precision of 93.7% with a recall of 90.5% in non-residential areas. To measure size-estimation performance they calculated the mean relative error (MRE), which was 3.0% for residential areas and 2.1% for non-residential areas. A minimal sketch of these metrics is shown below.

Future work

Currently, the DeepSolar database only covers the contiguous US. The researchers plan to expand its coverage to include all of North America, including remote areas with utility-scale solar and non-contiguous US states, and ultimately other countries and regions of the world. Also, DeepSolar currently estimates only the horizontal projection areas of solar panels from satellite imagery. In the future, it is expected to infer high-resolution roof orientation and tilt information from street-view images, giving a more accurate estimate of solar system size and solar power generation capacity.

To know more in detail, check out the research paper published by Ram Rajagopal et al: DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States.
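Below is a small, illustrative Python sketch of those evaluation metrics (precision, recall, and mean relative error), not the DeepSolar code itself; the counts and panel sizes are hypothetical values chosen only to land near the residential figures quoted above.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    # Precision: correct positive decisions among all positive decisions.
    precision = true_positives / (true_positives + false_positives)
    # Recall: correct positive decisions among all positive samples.
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

def mean_relative_error(estimated_sizes, true_sizes):
    # MRE: average of |estimate - truth| / truth over all panel systems.
    return sum(abs(e - t) / t for e, t in zip(estimated_sizes, true_sizes)) / len(true_sizes)

# Hypothetical counts, chosen only to roughly reproduce the reported residential numbers.
p, r = precision_recall(true_positives=885, false_positives=66, false_negatives=115)
print(f"precision={p:.1%}  recall={r:.1%}")          # ~93.1%, ~88.5%

# Hypothetical size estimates (square meters) vs. ground truth.
print(f"MRE={mean_relative_error([102.0, 48.5], [100.0, 50.0]):.1%}")   # 2.5%
```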
Introducing remove.bg, a deep learning based tool that automatically removes the background of any person based image within 5 seconds
NeurIPS 2018: How machine learning experts can work with policymakers to make good tech decisions [Invited Talk]
NVIDIA makes its new “brain for autonomous AI machines”, Jetson AGX Xavier Module, available for purchase


The District of Columbia files a lawsuit against Facebook for the Cambridge Analytica scandal

Prasad Ramesh
20 Dec 2018
2 min read
Karl Racine, the attorney general for the District of Columbia (Washington, DC), has filed a lawsuit against Facebook, nine months after the Cambridge Analytica scandal that affected over 87 million people worldwide. The lawsuit, which was filed on Wednesday, focuses mostly on the Cambridge Analytica scandal, in which user data was used without permission.

An investigation by the New York Times earlier this week showed that Facebook had given big companies numerous exceptions to its privacy policies. This made user data available via loopholes to companies including Amazon, Microsoft, Netflix, Spotify, and Sony.

This lawsuit breaks the silence, or lack of action, from US regulators on Facebook’s disregard of user data privacy. Congress, Silicon Valley critics, and even a global committee had urged Facebook to rethink its business model. Among the recent public hearings, Zuckerberg did not bother to attend the global hearing in the UK, unlike Google CEO Sundar Pichai, who was present at the Congress hearing.

The lawsuit states that: “Facebook collects and maintains troves of its personal user data, behavior on and off its website. Facebook permits third-party developers, including application developers and device makers, to access such sensitive information. Facebook says that it will take appropriate steps to maintain and protect user data but has failed to live up to this commitment.” It also cites “Facebook’s lax oversight of user data with respect to third-party applications”, that it failed to disclose the breach to affected consumers, and that “Facebook’s privacy settings are ambiguous and difficult to understand.” It goes on to state that this indicates the nature of Facebook’s relationships with partner companies.

The lawsuit also touches on other issues regarding Facebook, including its relationships with smartphone makers like BlackBerry, which could access user data irrespective of the users’ settings. Meanwhile, there is no news from federal regulators.

Know more about this lawsuit in detail on the District of Columbia government’s official website.

Is Anti-trust regulation coming to Facebook following fake news inquiry made by a global panel in the House of Commons, UK?
NYT says Facebook has been disclosing personal data to Amazon, Microsoft, Apple and other tech giants; Facebook denies claims with obfuscating press release
British parliament publishes confidential Facebook documents that underscore the growth at any cost culture at Facebook


SciPy 1.2.0 is out with a new optimization algorithm named ‘shgo’ and more!

Savia Lobo
19 Dec 2018
3 min read
Yesterday, the SciPy community released SciPy 1.2.0. This release contains many new features, along with numerous bug fixes, improved test coverage, and better documentation. It also includes a number of deprecations and API changes, and it requires Python 2.7 or 3.4+ and NumPy 1.8.2 or greater. The functions hyp2f0, hyp1f2 and hyp3f0 in scipy.special have been deprecated. According to the community, “this will be the last SciPy release to support Python 2.7. Consequently, the 1.2.x series will be a long term support (LTS) release; we will backport bug fixes until 1 Jan 2020”.

Highlights of SciPy 1.2.0

This release improves 1-D root finding with a new solver, toms748, and a new unified interface, root_scalar.
SciPy 1.2.0 has a new dual_annealing optimization method that combines stochastic and local deterministic searching.
The release features a new optimization algorithm named shgo (simplicial homology global optimization) for derivative-free optimization problems.
A new category of quaternion-based transformations is available in scipy.spatial.transform.

New improvements in SciPy 1.2.0

scipy.ndimage improvements: Proper spline coefficient calculations have been added for the mirror, wrap, and reflect modes of scipy.ndimage.rotate.

scipy.fftpack improvements: scipy.fftpack now supports DCT-IV, DST-IV, DCT-I, and DST-I orthonormalization.

scipy.interpolate improvements: scipy.interpolate.pade now accepts a new argument for the order of the numerator.

scipy.cluster improvements: scipy.cluster.vq.kmeans2 gained a new initialization method known as kmeans++.

scipy.special improvements: The function softmax has been added to scipy.special.

scipy.optimize improvements: The one-dimensional nonlinear solvers have been given a unified interface, scipy.optimize.root_scalar, similar to the scipy.optimize.root interface for multi-dimensional solvers. scipy.optimize.newton can now accept a scalar or an array.

scipy.signal improvements: Digital filter design functions now include a parameter to specify the sampling rate. Previously, digital filters could only be specified using normalized frequency, but different functions used different scales (e.g. 0 to 1 for butter vs 0 to π for freqz), leading to errors and confusion.

scipy.sparse improvements: The scipy.sparse.bsr_matrix.tocsr method is now implemented directly instead of converting via COO format, and the scipy.sparse.bsr_matrix.tocsc method is now routed via CSR conversion instead of COO. The efficiency of both conversions is improved.

scipy.spatial improvements: The function scipy.spatial.distance.jaccard has been modified to return 0 instead of np.nan when two all-zero vectors are compared. Support for the Jensen-Shannon distance, the square root of the Jensen-Shannon divergence, has been added under scipy.spatial.distance.jensenshannon. A new category of quaternion-based transformations is available in scipy.spatial.transform, including spherical linear interpolation of rotations (Slerp), conversions to and from quaternions and Euler angles, general rotation and inversion capabilities (spatial.transform.Rotation), and uniform random sampling of 3D rotations (spatial.transform.Rotation.random).

scipy.stats improvements: Levy Stable parameter estimation, PDF, and CDF calculations are now supported for scipy.stats.levy_stable. stats and mstats now have access to a new regression method, siegelslopes, a robust linear regression algorithm. The Brunner-Munzel test is now available as brunnermunzel in stats and mstats.

scipy.linalg improvements: scipy.linalg.lapack now exposes the LAPACK routines using Rectangular Full Packed (RFP) storage for upper triangular, lower triangular, symmetric, or Hermitian matrices; the upper trapezoidal fat matrix RZ decomposition routines are now available as well.
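Below is a brief, hedged sketch exercising a few of the additions listed above (root_scalar, dual_annealing, shgo, and scipy.special.softmax) on toy problems; it is illustrative rather than exhaustive, and the objective functions are arbitrary examples.

```python
import numpy as np
from scipy import optimize, special

# 1-D root finding with the new unified interface.
f = lambda x: x**3 - 2*x - 5
root = optimize.root_scalar(f, bracket=[2, 3], method="brentq")
print("root_scalar:", root.root)

# The new global optimizers on an arbitrary 2-D bowl with box bounds.
objective = lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2
bounds = [(-5, 5), (-5, 5)]
print("dual_annealing:", optimize.dual_annealing(objective, bounds).x)
print("shgo:", optimize.shgo(objective, bounds).x)

# The new softmax in scipy.special.
print("softmax:", special.softmax(np.array([1.0, 2.0, 3.0])))
```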
To know more about SciPy 1.2.0 and its backward incompatible changes, read the release notes on GitHub.

Implementing matrix operations using SciPy and NumPy
How to Compute Interpolation in SciPy
How to compute Discrete Fourier Transform (DFT) using SciPy


Patreon speaks out against the protests over its banning Sargon of Akkad for violating its rules on hate speech

Natasha Mathur
19 Dec 2018
3 min read
Patreon, a popular crowdfunding platform, published a post yesterday in defense of its removal of Sargon of Akkad, or Carl Benjamin, an English YouTuber famous for his anti-feminist content, last week over concerns that he violated its policies on hate speech. Patreon has been receiving backlash ever since from users and patrons of the website who are calling for a boycott.

“Patreon does not and will not condone hate speech in any of its forms. We stand by our policies against hate speech. We believe it’s essential for Patreon to have strong policies against hate speech to build a safe community for our creators and their patrons,” says the Patreon team.

Patreon mentioned that it reviews the creations posted by content creators on other platforms that are funded via Patreon. Since Benjamin is quite popular for his collaborations with other creators, Patreon’s community guidelines, which strictly prohibit hate speech, also apply to those collaborations. According to Patreon’s community guidelines, “Hate speech includes serious attacks, or even negative generalizations, of people based on their race [and] sexual orientation.” Benjamin, in an interview on another YouTuber’s channel, used racial slurs linked with “negative generalizations of behavior”, quite contrary to how people of those races actually act, to insult others. Apart from racial slurs, he also used slurs related to sexual orientation, which violates Patreon’s community guidelines.

However, a lot of people are not happy with Patreon’s decision. For instance, Sam Harris, a popular American author, podcast host, and neuroscientist, who had one of the top-grossing accounts on Patreon (with nearly 9,000 paying patrons at the end of November), deleted his account earlier this week, accusing the platform of “political bias”. He wrote, “the crowdfunding site Patreon has banned several prominent content creators from its platform. While the company insists that each was in violation of its terms of service, these recent expulsions seem more readily explained by political bias. I consider it no longer tenable to expose any part of my podcast funding to the whims of Patreon’s ‘Trust and Safety’ committee.”

https://twitter.com/SamHarrisOrg/status/1074504882210562048

Apart from banning Carl Benjamin, Patreon also banned Milo Yiannopoulos, a British public speaker and YouTuber with over 839,286 subscribers, earlier this month over his association with the Proud Boys, which Patreon has classified as a hate group.

https://twitter.com/Patreon/status/1070446085787668480

James Allsup, an alt-right political commentator and associate of Yiannopoulos’, was also banned from Patreon last month for his association with hate groups.

Amid this controversy, some top Patreon creators, such as Jordan Peterson, a popular Canadian clinical psychologist whose YouTube channel has over 1.6M subscribers, and Dave Rubin, an American libertarian political commentator, announced plans earlier this week to start an alternative to Patreon. Peterson said that the new platform will work on a subscriber model similar to Patreon’s, only with a few additional features.

https://www.youtube.com/watch?v=GWz1RDVoqw4

“We understand some people don’t believe in the concept of hate speech and don’t agree with Patreon removing creators on the grounds of violating our Community Guidelines for using hate speech. We have a different view,” says the Patreon team.
Emmanuel Macron teams up with Facebook in a bid to fight hate speech on social media
Twitter takes action towards dehumanizing speech with its new policy
How IRA hacked American democracy using social media and meme warfare to promote disinformation and polarization: A new report to Senate Intelligence Committee


Anthony Levandowski announces Pronto AI and makes a coast-to-coast self-driving trip

Sugandha Lahoti
19 Dec 2018
2 min read
Anthony Levandowski is back in the self-driving space with a new company, Pronto AI. This Tuesday, he announced in a blog post on Medium that he has completed a trip across the country in a self-driving car without any human intervention. He is also developing a $5,000 aftermarket driver assistance system for semi-trucks, which will handle the steering, throttle, and brakes on the highway.

https://twitter.com/meharris/status/1075036576143466497

Previously, Levandowski was at the center of a controversy between Alphabet’s self-driving car company Waymo and Uber. Levandowski had allegedly taken confidential documents with him, over which the companies got into a legal battle. He was briefly barred from the autonomous driving industry during the trial. However, the companies settled the case early this year. After laying low for a while, he is back with Pronto AI and its first ADAS (advanced driver assistance system). “I know what some of you might be thinking: ‘He’s back?’” Levandowski wrote in his Medium post announcing Pronto’s launch. “Yes, I’m back.”

Levandowski told the Guardian that he traveled in a self-driving vehicle from San Francisco to New York without human intervention. He didn't touch the steering wheel or pedals — except for periodic rest stops — for the full 3,099 miles. He posted a video that shows a portion of the drive, though it's hard to fact-check the full journey. The car was a modified Toyota Prius that used only video cameras, computers, and basic digital maps to make the cross-country trip.

In the Medium blog post, he also announced the development of a new camera-based ADAS. Named Copilot by Pronto, it delivers advanced features built specifically for Class 8 vehicles, with driver comfort and safety top of mind. It will also offer lane keeping, cruise control, and collision avoidance for commercial semi-trucks, and will roll out in early 2019.

Alphabet’s Waymo to launch the world’s first commercial self-driving cars next month
Apex.AI announced Apex.OS and Apex.Autonomy for building failure-free autonomous vehicles
Uber manager warned the leadership team of the inadequacy of safety procedures in their prototype robo-taxis early March, reports The Information

Introducing remove.bg, a deep learning based tool that automatically removes the background of any person based image within 5 seconds

Amrata Joshi
18 Dec 2018
3 min read
Yesterday, Benjamin Groessing, a web consultant and developer at byteq, released remove.bg, a tool built on Python, Ruby, and deep learning. The tool automatically removes the background of any image within 5 seconds, using various custom algorithms to process the image.

https://twitter.com/hammer_flo_/status/1074914463726350336

It is a free service, and users don’t have to manually select the background/foreground layers to separate them. One can simply select an image and instantly download the result with the background removed.

Features of remove.bg

Personal and professional use: remove.bg can be used by graphic designers, photographers, or selfie lovers for removing backgrounds.
Saves time and money: It saves time because it is automated, and it is free of cost.
100% automatic: Apart from the image file, this release doesn’t require inputs such as selecting pixels, marking persons, etc.

How does remove.bg work?

https://twitter.com/begroe/status/1074645152487129088

remove.bg uses AI for detecting foreground layers and separating them from the background. It uses additional algorithms for improving fine details and preventing color contamination. The AI detects persons as foreground and everything else as background, so it only works if there is at least one person in the image. Users can upload images of any resolution, but for performance reasons the output image is limited to 500 × 500 pixels.

Privacy in remove.bg

User images are uploaded through a secure SSL/TLS-encrypted connection. The images are processed and the result is stored temporarily until the user downloads it; approximately an hour later, the image files are deleted. The privacy message on the official website of remove.bg states, “We do not share your images or use them for any other purpose than removing the background and letting you download the result.”

What can be expected from the next release?

The next set of releases might support other kinds of images, such as product images. The team at remove.bg might also release an easy-to-use API.

Users are very excited about this release and the technology behind it. Many users are comparing it with the portrait mode on the iPhone X. Though it is not that fast, users still like it.

https://twitter.com/Baconbrix/status/1074805036264316928

How strong remove.bg is with regard to privacy is a bigger question, though. The website gives a privacy note, but it will take more to win users’ trust. The images uploaded to remove.bg’s cloud might be at risk. How strong is the security, and what preventive measures have they taken? These are a few of the questions that might bother many. To have a look at the ongoing discussion on remove.bg, check out Benjamin Groessing’s AMA Twitter thread.

Facebook open-sources PyText, a PyTorch based NLP modeling framework
Deep Learning Indaba presents the state of Natural Language Processing in 2018
NYU and AWS introduce Deep Graph Library (DGL), a python package to build neural network graphs


France to levy digital services tax on big tech companies like Google, Apple, Facebook, Amazon in the new year

Savia Lobo
18 Dec 2018
3 min read
At a press conference yesterday, French Economy Minister Bruno Le Maire announced that France will levy a new tax on big tech companies including Google, Apple, Facebook, and Amazon (also known as GAFA) with effect from January 1, 2019. The tax is estimated to bring in €500m, or about $567 million, in the coming year.

Le Maire told France24 television, “I am giving myself until March to reach a deal on a European tax on the digital giants. If the European states do not take their responsibilities on taxing the GAFA, we will do it at a national level in 2019.” In an interview with Reuters and a small group of European newspapers, Le Maire said, “We want a fair taxation of digital giants that creates value in Europe in 2019”.

France, with Germany’s help, had proposed a comprehensive digital services tax (DST) to cover all 28 EU member states. However, Ireland dismissed the move, stating that it would aggravate US-EU trade tensions. Dublin also said the bloc should act only after the Organisation for Economic Co-operation and Development (OECD) presents its tax proposals in 2019. Le Maire, however, said that France would press ahead alone with the tax.

In March 2018, the European Commission published a proposal for a 3% tax on tech giants with global revenues north of €750m (about $850 million) per year and EU revenue above €50m (about $57 million). But with disagreements from some member states, including Ireland and the Netherlands, on how to move forward with such a tax, the process has stalled. Per Le Maire, “The digital giants are the ones who have the money.” The companies “make considerable profits thanks to French consumers, thanks to the French market, and they pay 14 percentage points of tax less than other businesses.”

In October, British Chancellor Philip Hammond announced in the Budget that he plans to introduce a digital services tax from April 2020, following a consultation. The Chancellor's office has suggested that the tax would generate at least 400 million pounds ($505 million) per year.

According to Reuters, “President Emmanuel Macron’s government has proposed taxing the tech giants on revenues rather than profits, to get around the problem that the companies shift the profits from where they are earned to low tax jurisdictions.” In their alternative plan at a meeting of EU finance ministers, France and Germany proposed levying a 3 percent tax on digital advertising from Google and Facebook, which together account for about 75 percent of digital advertising, starting in 2021. Ministers asked the European Commission to work on the new proposal and present its findings to them in January or February. After the meeting, Le Maire said, “It's a first step in the right direction, which in the coming months should make the taxation of digital giants a possibility.”

To know more about this in detail, visit France24’s complete coverage.

Australia’s Assistance and Access (A&A) bill, popularly known as the anti-encryption law, opposed by many including the tech community
Amazon addresses employees dissent regarding the company’s law enforcement policies at an all-staff meeting, in a first
Senator Ron Wyden’s data privacy law draft can punish tech companies that misuse user data


How Facebook uses HyperLogLog in Presto to speed up cardinality estimation

Bhagyashree R
18 Dec 2018
3 min read
Yesterday, Facebook shared how it uses HyperLogLog (HLL) in Presto to perform computationally intensive operations like estimating the number of distinct values within a huge dataset. With this implementation, Facebook was able to achieve up to 1,000x speed improvements on count-distinct problems.

What is HyperLogLog?

HyperLogLog is an algorithm designed for estimating the number of unique values within a huge dataset, also known as its cardinality. To produce an estimate of the cardinality, it uses an auxiliary memory of m units and performs a single pass over the data. The algorithm is an improved version of a previously known cardinality estimator, LogLog.

Facebook uses HyperLogLog in scenarios like determining the number of distinct people visiting Facebook in the past week using a single machine. To further speed up these types of queries, it implemented HLL in Presto, an open source distributed SQL query engine designed for running interactive analytic queries against data sources of all sizes. Using HLL, the same calculation can be performed in 12 hours with less than 1 MB of memory. Facebook highlights that it has seen great improvements, with some queries running within minutes, including those used to analyze thousands of A/B tests.

Presto’s HLL implementation

The implementation of HLL data structures in Presto consists of two layout formats: sparse and dense. To save memory, storage starts off with a sparse layout, and when the input data structure goes over the prespecified memory limit for the sparse format, Presto switches to the dense layout automatically. The sparse layout is used to get an almost exact count in low-cardinality datasets, for instance, the number of distinct countries. The dense layout is used in cases where the cardinality is high, such as the number of distinct users. There is a HYPERLOGLOG data type in Presto. For users who prefer a single format so that they can process the output structure on other platforms, such as Python, there is another data type called P4HYPERLOGLOG, which starts and stays strictly as a dense HLL.

To read more in detail about how Facebook uses HLL, check out their article.

Facebook open-sources PyText, a PyTorch based NLP modeling framework
Facebook contributes to MLPerf and open sources Mask R-CNN2Go, its CV framework for embedded and mobile devices
Australia’s ACCC publishes a preliminary report recommending Google Facebook be regulated and monitored for discriminatory and anti-competitive behavior
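For intuition, here is a minimal, illustrative Python sketch of the core HyperLogLog idea described in the article above: hash each value, use the first p bits to pick one of m registers, and keep the maximum leading-zero rank seen per register. This is not Presto's implementation; the hash function and precision p below are arbitrary choices.

```python
import hashlib
import math

def _hash64(value):
    # 64-bit hash of the input (an arbitrary, illustrative choice of hash function).
    return int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")

def hll_estimate(values, p=14):
    """Estimate cardinality with m = 2**p registers (single pass, O(m) memory)."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        x = _hash64(v)
        idx = x >> (64 - p)                      # first p bits choose a register
        rest = x & ((1 << (64 - p)) - 1)         # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)             # bias-correction constant for large m
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:            # small-range (linear counting) correction
        estimate = m * math.log(m / zeros)
    return int(estimate)

print(hll_estimate(range(100_000)))              # prints roughly 100,000
```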


Facebook open-sources PyText, a PyTorch based NLP modeling framework

Amrata Joshi
17 Dec 2018
4 min read
Last week, the team at Facebook AI Research announced that they are open sourcing the PyText NLP framework. PyText, a deep-learning based NLP modeling framework, is built on PyTorch. Facebook is open sourcing some of the conversational AI tech powering the Portal video chat display and M suggestions on Facebook Messenger.

https://twitter.com/fb_engineering/status/1073629026072256512

How is PyText useful for Facebook?

The PyText framework is used for tasks like document classification, semantic parsing, sequence tagging, and multitask modeling. It fits easily into research and production workflows and emphasizes robustness and low latency to meet Facebook’s real-time NLP needs. PyText is also responsible for models powering more than a billion daily predictions at Facebook. The framework addresses the conflicting requirements of enabling rapid experimentation and serving models at scale by providing simple interfaces and abstractions for model components. It uses PyTorch’s capability to export models for inference through the optimized Caffe2 execution engine.

Features of PyText

PyText features production-ready models for various NLP/NLU tasks such as text classifiers, sequence taggers, etc.
PyText comes with distributed-training support, built on the new C10d backend in PyTorch 1.0.
It features extensible components that help in creating new models and tasks.
The framework’s modularity allows it to create new pipelines from scratch and modify existing workflows.
It comes with a simplified workflow for faster experimentation.
It gives access to a rich set of prebuilt model architectures for text processing and vocabulary management.
It serves as an end-to-end platform for developers, and its modular structure helps engineers incorporate individual components into existing systems.
It adds support for string tensors to work efficiently with text in both training and inference.

PyText for NLP development

PyText improves the workflow for NLP and supports distributed training for speeding up NLP experiments that require multiple runs.

Easily portable: PyText models can be easily shared across different organizations in the AI community.

Prebuilt models: With models focused on NLP tasks such as text classification, word tagging, semantic parsing, and language modeling, the framework makes it possible to use prebuilt models on new data easily.

Contextual models: For improving conversational understanding in various NLP tasks, PyText uses contextual information, such as an earlier part of a conversation thread. There are two contextual models in PyText: a SeqNN model for intent labeling tasks and a Contextual Intent Slot model for joint training on both tasks.

PyText exports models to Caffe2

PyText uses PyTorch 1.0’s capability to export models for inference through the optimized Caffe2 execution engine. Native PyTorch models require a Python runtime, which is not scalable because of the multithreading limitations of Python’s Global Interpreter Lock. Exporting to Caffe2 provides an efficient multithreaded C++ backend for serving huge volumes of traffic efficiently. PyText’s ability to test new state-of-the-art models will be improved further in the next release. Since putting sophisticated NLP models on mobile devices is a big challenge, the team at Facebook AI Research will work towards building an end-to-end workflow for on-device models.
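As an illustration of that export path, here is a minimal, hedged PyTorch sketch, not PyText's actual API; the model, shapes, and file name are hypothetical. It exports a tiny text classifier to ONNX, which Caffe2 or another ONNX-compatible runtime can then serve without a Python interpreter in the serving path.

```python
import torch
import torch.nn as nn

class TinyTextClassifier(nn.Module):
    """Hypothetical toy model: average-pooled word embeddings + linear classifier."""
    def __init__(self, vocab_size=10_000, embed_dim=64, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):                      # token_ids: (batch, seq_len)
        pooled = self.embedding(token_ids).mean(dim=1)  # average-pool word embeddings
        return self.fc(pooled)                          # class logits

model = TinyTextClassifier().eval()

# Dummy padded batch of two documents, 16 token ids each (hypothetical shapes).
dummy = torch.randint(0, 10_000, (2, 16))

# Export for inference outside the Python runtime (e.g. via Caffe2's ONNX backend).
torch.onnx.export(model, dummy, "text_classifier.onnx",
                  input_names=["token_ids"], output_names=["logits"])
```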
The team plans to add support for multilingual modeling and other modeling capabilities. They also plan to make models easier to debug, and might add further optimizations for distributed training. “PyText has been a collaborative effort across Facebook AI, including researchers and engineers focused on NLP and conversational AI, and we look forward to working together to enhance its capabilities,” said the Facebook AI Research team.

Users are excited about this news and want to explore more.

https://twitter.com/ezylryb_/status/1073893067705409538
https://twitter.com/deliprao/status/1073671060585799680

To know about this in detail, check out the release notes on GitHub.

Facebook contributes to MLPerf and open sources Mask R-CNN2Go, its CV framework for embedded and mobile devices
Facebook retires its open source contribution to Nuclide, Atom IDE, and other associated repos
Australia’s ACCC publishes a preliminary report recommending Google Facebook be regulated and monitored for discriminatory and anti-competitive behavior

NVIDIA makes its new “brain for autonomous AI machines”, Jetson AGX Xavier Module, available for purchase

Natasha Mathur
17 Dec 2018
3 min read
NVIDIA made the Jetson AGX Xavier module, its new “powerful brain” for autonomous AI machines, available for purchase worldwide last week, starting at a volume price of $1,099 for batches of 1,000 units or more.

The Jetson AGX Xavier module is the newest addition to the Jetson family, alongside the TX2 and TX1 developer kits. It is aimed at providing high-level performance and will allow companies to go into volume production of applications developed on the Jetson AGX Xavier developer kit, which was released back in September.

The Jetson AGX Xavier module consumes as little as 10 watts of power and delivers 32 trillion operations per second (TOPS). It is powered by a 512-core Volta GPU with Tensor Cores and an 8-core ARM v8.2 64-bit CPU, and it also comes with two NVDLA deep learning chips and dedicated image, video, and vision processors. In addition, it is supported by NVIDIA’s JetPack and DeepStream software development kits. JetPack is NVIDIA’s SDK for autonomous machines that includes support for AI, computer vision, multimedia, and more. The DeepStream SDK enables streaming analytics, and developers can build multi-camera and multi-sensor applications to detect and identify objects such as vehicles, pedestrians, and cyclists.

“These SDKs save developers and companies time and money while making it easy to add new features and functionality to machines to improve performance. With this combination of new hardware and software, it’s now possible to deploy AI-powered robots, drones, intelligent video analytics applications and other intelligent devices at scale,” mentions the NVIDIA team.

The Jetson AGX Xavier module has already been put to use by Oxford Nanopore, a U.K. medical technology startup, where it handles DNA sequencing in real time with the MinION, a powerful handheld DNA sequencer. Also, Japan’s DENSO, a global auto parts maker, believes that Jetson AGX Xavier will be a key platform for introducing AI to its auto parts manufacturing factories, where it will help boost productivity and efficiency.

“Developers can use Jetson AGX Xavier to build the autonomous machines that will solve some of the world’s toughest problems, and help transform a broad range of industries. Millions are expected to come onto the market in the years ahead,” says the NVIDIA team.

NVIDIA open sources its game physics simulation engine, PhysX, and unveils PhysX SDK 4.0
NVIDIA leads the AI hardware race. But which of its GPUs should you use for deep learning
NVIDIA shows off GeForce RTX, real-time raytracing GPUs, as the holy grail of computer graphics to gamers


NumPy drops Python 2 support. Now you need Python 3.5 or later.

Prasad Ramesh
17 Dec 2018
2 min read
In a GitHub pull request last week, the NumPy community decided to remove support for Python 2.7. Python 3.4 support will also be dropped with this pull request. So, to use NumPy 1.17 and newer versions, you will need Python 3.5 or later. NumPy has supported both Python versions since 2010. This move doesn't come as a surprise, with the Python core team itself dropping support for Python 2 in 2020. The NumPy team mentioned that the move comes because “Python 2 is an increasing burden on our limited resources”. The discussion to drop Python 2 support in NumPy started almost a year ago.

Running pip install numpy on Python 2 will still install the last working version, but from here on it may not contain the latest features released for Python 3.5 or higher. However, NumPy on Python 2 will still be supported until December 31, 2019. After January 1, 2020, it may not receive the newest bug fixes.

The Twitter audience sees this as a welcome move:

https://twitter.com/TarasNovak/status/1073262599750459392
https://twitter.com/esc___/status/1073193736178462720

A comment on Hacker News reads: “Let's hope this move helps with the transitioning to Python 3. I'm not a Python programmer myself, but I'm tired of things getting hairy on Linux dependencies written in Python. It almost seems like I always got to have a Python 2 and a Python 3 version of some packages so my system doesn't break.” Another one reads: “I've said it before, I'll say it again. I don't care for everything-is-unicode-by-default. You can take my Python 2 when you pry it from my cold dead hands.”

Some researchers who use NumPy and SciPy stick to Python 2; this move from the NumPy team will help get everyone working on a single version, and one supported version will certainly help with the fragmentation. Often, Python developers find themselves in a situation where they have one version installed and a specific module is available or works properly only in another version. Some also argue about stability, claiming that Python 2 has greater stability or some particular feature. But the general sentiment is more supportive of adopting Python 3.

Introducing numpywren, a system for linear algebra built on a serverless architecture
NumPy 1.15.0 release is out!
Implementing matrix operations using SciPy and NumPy
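For projects that still need to install on both interpreters during the transition, one common approach is PEP 508 environment markers. The sketch below shows a hypothetical setup.py (package name and version are made up) that keeps Python 2 environments pinned below NumPy 1.17 while Python 3.5+ picks up current releases.

```python
# Hypothetical setup.py for a project supporting both interpreters after this change:
# Python 3.5+ environments get current NumPy, Python 2 stays on the last supported line.
from setuptools import setup

setup(
    name="example-project",      # made-up package name
    version="0.1.0",
    install_requires=[
        'numpy>=1.17; python_version >= "3.5"',   # PEP 508 environment marker
        'numpy<1.17; python_version < "3.5"',
    ],
)
```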


Google won’t sell its facial recognition technology until questions around tech and policy are sorted

Savia Lobo
14 Dec 2018
4 min read
Google yesterday released a blog post titled ‘AI for Social Good in Asia Pacific’, in which it mentioned that it has “chosen not to offer general-purpose facial recognition APIs before working through important technology and policy questions”.

According to senior vice president of Global Affairs Kent Walker, “Like many technologies with multiple uses, facial recognition merits careful consideration to ensure its use is aligned with our principles and values, and avoids abuse and harmful outcomes. We continue to work with many organizations to identify and address these challenges, and unlike some other companies, Google Cloud has chosen not to offer general-purpose facial recognition APIs before working through important technology and policy questions.”

Google backed away from the military drone project and published ethical AI principles that prohibit weapons and surveillance usage, which facial recognition falls under, in light of Project Maven with the U.S. Department of Defense. Facial recognition technology has risen in popularity after finding use cases from the entertainment industry to law enforcement agencies. Many companies have also faced a lot of pushback on how well they have handled their own technologies and whom they have sold them to. According to Engadget, “Amazon, for instance, has come under fire for selling its Rekognition software to law enforcement groups, and civil rights groups, as well as its own investors and employees, have urged the company to stop providing its facial recognition technology to police. In a letter to CEO Jeff Bezos, employees warned about Rekognition's potential to become a surveillance tool for the government, one that would ‘ultimately serve to harm the most marginalized.’”

The American Civil Liberties Union’s Nicole Ozer issued a statement in support of the development: “This is a strong first step. Google today demonstrated that, unlike other companies doubling down on efforts to put dangerous face surveillance technology into the hands of law enforcement and ICE, it has a moral compass and is willing to take action to protect its customers and communities. Google also made clear that all companies must stop ignoring the grave harms these surveillance technologies pose to immigrants and people of color, and to our freedom to live our lives, visit a church, or participate in a protest without being tracked by the government.”

Amazon had also pitched its Rekognition software to ICE in October. Per Engadget, “Yesterday during a hearing with the New York City Council, an Amazon executive didn't deny having a contract with the agency, saying in response to a question about its involvement with ICE that the company provides Rekognition ‘to a variety of government agencies.’” Lawmakers in the US have now asked Amazon for more information about Rekognition multiple times. “Microsoft also shared six principles it has committed to regarding its own facial recognition technology. Among those guidelines is a pledge to treat people fairly and to provide clear communication about the technology's capabilities and limitations,” says Engadget.

To know more about this in detail, visit Google’s official blog post.

Google AI releases Cirq and Open Fermion-Cirq to boost Quantum computation
Google expands its machine learning hardware portfolio with Cloud TPU Pods (alpha) to effectively train and deploy TensorFlow machine learning models on GCP
‘Istio’ available in beta for Google Kubernetes Engine, will accelerate app delivery and improve microservice management

Cockroach Labs 2018 Cloud Report: AWS outperforms GCP hands down

Melisha Dsouza
14 Dec 2018
5 min read
While testing the features for CockroachDB 2.1, the team discovered that AWS offered 40% greater throughput than GCP. To understand the reason for this result, the team compared GCP and AWS on TPC-C performance (e.g., throughput and latency), CPU, network, I/O, and cost. This led Cockroach Labs to release a 2018 Cloud Report to help customers decide which cloud solution to go with, based on the most commonly faced questions: should they use Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure? How should they tune their workload for different offerings? Which of the platforms is more reliable?

Note: They did not test Microsoft Azure due to bandwidth constraints but will do so in the near future.

The tests conducted

For GCP, the team chose the n1-standard-16 machine with the Intel Xeon Scalable Processor (Skylake) in the us-east region. For AWS, they chose the latest compute-optimized AWS instance type, c5d.4xlarge, to match n1-standard-16, because both have 16 CPUs and SSDs.

#1 TPC-C benchmarking test

The team tested workload performance using TPC-C. The results were surprising: CockroachDB 2.1 achieves 40% more throughput (tpmC) on TPC-C when tested on AWS using c5d.4xlarge than on GCP via n1-standard-16. They then tested TPC-C against some of the most popular AWS instance types, focusing on the higher-performing c5 series with SSDs, EBS-gp2, and EBS-io1 volume types. The AWS Nitro System present in the c5 and m5 series offers approximately similar or superior performance compared to a similar GCP instance. The results were clear: AWS wins on the TPC-C benchmark.

#2 CPU experiment

The team chose stress-ng because, according to them, it offered more benchmarks and more flexible configurations than the sysbench benchmarking test. On running the command stress-ng --metrics-brief --cpu 16 -t 1m five times on both AWS and GCP, they found that AWS offered 28% more throughput (~2,900) on stress-ng than GCP.

#3 Network throughput and latency test

The team measured network throughput using a tool called iPerf and latency via another tool, PING, and they have documented the detailed iPerf setup in a blog post. The tests were run four times each for AWS and GCP, and the results once again showed AWS was better than GCP. GCP showed a fairly normal distribution of network throughput centered at ~5.6 GB/sec, ranging from 4.01 GB/sec to 6.67 GB/sec, which according to the team is “a somewhat unpredictable spread of network performance”, reinforced by the observed average variance of 0.487 GB/sec for GCP. AWS offers significantly higher throughput, centered on 9.6 GB/sec, with a much tighter spread between 9.60 GB/sec and 9.63 GB/sec. For AWS, the throughput variance is only 0.006 GB/sec, which means GCP's network throughput is 81x more variable than AWS's. The network latency test showed that AWS has tighter network latency than GCP, with AWS's values centered on an average latency of 0.057 ms. In short, AWS offers significantly better network throughput and latency with none of the variability present in GCP.

#4 I/O experiment

The team tested I/O using a configuration of sysbench that simulates small writes with frequent syncs, for both write and read performance. This test measures throughput based on a fixed number of threads, i.e., the number of items concurrently writing to disk. The write results showed that AWS consistently offers more write throughput across all thread counts from 1 up to 64, with as much as a 67x difference in throughput, and AWS also offers better average and 95th-percentile write latency across all thread tests. At 32 and 64 threads, GCP provides marginally more throughput. For read latency, AWS tops the charts for up to 32 threads; at 32 and 64 threads, GCP and AWS split the results, with GCP offering marginally better read performance at similar latency. The team also used the no-barrier method of writing directly to disk without waiting for the write cache to be flushed. Here the results were the reverse of the experiments above: no barrier speeds GCP up by 6x, while on AWS, no barrier is only a 25% speedup.

#5 Cost

Since AWS outperformed GCP on the TPC-C benchmarks, the team also compared the cost involved on both platforms. For both clouds they assumed the following discounts: on GCP, a three-year committed use price discount with local SSD in the central region; on AWS, a three-year standard contract paid up front. They found that GCP is more expensive than AWS given the performance it showed in the tests: GCP costs 2.5 times more than AWS per tpmC.
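The cost comparison boils down to simple arithmetic: amortized machine cost divided by sustained TPC-C throughput. The sketch below illustrates the "cost per tpmC" calculation with made-up numbers, not the report's actual figures.

```python
def cost_per_tpmc(three_year_cost_usd, tpmc):
    """Dollars of three-year machine cost per unit of TPC-C throughput (tpmC)."""
    return three_year_cost_usd / tpmc

# Made-up illustrative inputs, NOT the report's figures.
aws_cost = cost_per_tpmc(three_year_cost_usd=30_000, tpmc=40_000)   # $0.75 per tpmC
gcp_cost = cost_per_tpmc(three_year_cost_usd=45_000, tpmc=28_000)   # ~$1.61 per tpmC

print(f"AWS: ${aws_cost:.2f} per tpmC")
print(f"GCP: ${gcp_cost:.2f} per tpmC ({gcp_cost / aws_cost:.1f}x AWS)")
```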
In response to this report, Google Cloud developer advocate Seth Vargo posted a comment on Hacker News assuring users that Google's team would look into the tests and conduct their own benchmarking to provide customers with the much-needed answers to the questions raised by this report. It will be interesting to see the results GCP comes up with in response. Head over to cockroachlabs.com for more insights on the tests conducted.

CockroachDB 2.0 is out!
Cockroach Labs announced managed CockroachDB-as-a-Service
Facebook GEneral Matrix Multiplication (FBGEMM), high performance kernel library, open sourced, to run deep learning models efficiently


Devart releases standard edition of dbForge Studio for PostgreSQL

Sugandha Lahoti
14 Dec 2018
2 min read
Devart has released a standard edition of dbForge Studio for PostgreSQL with new enhancements. dbForge Studio for PostgreSQL is a GUI tool for managing and developing databases and objects in PostgreSQL. Developers can use it for creating and executing queries and adjusting code to their requirements.

dbForge Studio for PostgreSQL is now available in two editions: the free Express edition with basic functionality, and the Standard edition with advanced features. The Standard edition can be evaluated at no cost for 30 days; when the trial period expires, users can purchase a license to continue using the software, or it will be limited to the free Express edition.

The Standard edition includes the following exclusive features:

Data import and export: Developers can fill PostgreSQL databases with external source data and migrate data between systems.
Master-Detail browser: Provides a simultaneous data view in related tables; also convenient for quick data analysis and locating specific records and logical errors in the database.
Data reports: The PostgreSQL Report Builder, with support for chart plotting, converts your data into a good-looking report.
Pivot table: A visual Pivot Table Designer, advanced filtering, and visual data presented in a graph make data easier to read, understand, and analyze.
SQL snippets: Code snippets help speed up SQL code typing.
Query execution history: The Execution History window provides an easy and convenient way to view, search, and edit executed queries.
Execute large script: Large scripts can be executed with the help of the handy Execute Script Wizard, without opening them in the SQL editor and loading the whole script into memory.
SQL document: Context prompt in the FROM list of SELECT queries, and full code completion support for the SELECT statement.
Connectivity: SSH connection, plus connectivity support for PostgreSQL 11.x, 10.x, 8.3, 8.1, and elephantsql.com.
Other improvements: Digital signature for the installation file, and FIPS compliance.

Visit the Devart blog for more information about dbForge Studio for PostgreSQL.

PipelineDB 1.0.0, the high performance time-series aggregation for PostgreSQL, released!
Citus Data to donate 1% of its equity to non-profit PostgreSQL organizations
PostgreSQL 11 is here with improved partitioning performance, query parallelism, and JIT compilation