
Tech News - Data


Machine learning experts on how we can use machine learning to mitigate and adapt to the changing climate

Bhagyashree R
18 Jun 2019
5 min read
Last week, a team of machine learning experts published a paper titled “Tackling Climate Change with Machine Learning”. The paper highlights how machine learning can be used to reduce greenhouse gas emissions and help society adapt to a changing climate.

https://twitter.com/hardmaru/status/1139340463486320640

Climate change and its consequences are becoming more apparent to us day by day. One of the most significant of these is global warming, which is mainly caused by the emission of greenhouse gases. The report suggests that we can mitigate this problem by making changes to existing electricity systems, transportation, buildings, industry, and land use. Adapting to the changing climate requires climate modeling, risk prediction, and planning for resilience and disaster management. This 54-page report lists various steps involving machine learning that can help us mitigate greenhouse gas emissions and adapt to the changing climate. In this article, we look at how machine learning and deep learning can be used to reduce greenhouse gas emissions from electricity systems.

Electricity systems

A quarter of human-caused greenhouse gas emissions come from electricity systems. To minimize this, we need to switch to low-carbon electricity sources. Additionally, we should take steps to reduce emissions from existing carbon-emitting power plants. There are two types of low-carbon electricity sources: variable and controllable.

Variable sources

Variable sources are those that fluctuate based on external factors; for instance, the energy produced by solar panels depends on the sunlight.

Power generation and demand forecasting

Though ML and deep learning methods have been applied to power generation and demand forecasting before, this was done using domain-agnostic techniques: for instance, using clustering techniques on households, or using game theory, optimization, regression, or online learning to predict disaggregated quantities from aggregate electricity signals. The study suggests that future ML algorithms should incorporate domain-specific insights, drawing on innovations in climate modeling and weather forecasting and in hybrid-plus-ML modeling techniques. These techniques will help improve both short- and medium-term forecasts. ML models can also be used to directly optimize for system goals.

Improving scheduling and flexible demand

ML can play an important role in improving the existing centralized process of scheduling and dispatching by speeding up power system optimization problems. It can be used to fit fast function approximators to existing optimization problems or to provide good starting points for optimization. Dynamic scheduling and safe reinforcement learning can also be used to balance the electric grid in real time to accommodate variable generation or demand. ML or other simpler techniques can enable flexible demand by making storage and smart devices respond automatically to electricity prices. To provide appropriate signals for flexible demand, system operators can design electricity prices based on, for example, forecasts of variable electricity or grid emissions.

Accelerated science for materials

Many scientists are working to introduce new materials that can store or harness energy from variable natural resources more efficiently. For instance, solar fuels are synthetic fuels produced from sunlight or solar heat; they can capture solar energy when the sun is up and store it for later use.

However, coming up with new materials can be very slow and imprecise. There are times when human experts do not understand the physics behind these materials and have to manually apply heuristics to understand a proposed material’s physical properties. ML techniques can be helpful in such cases: they can automate this process by combining “heuristics with experimental data, physics, and reasoning to apply and even extend existing physical knowledge.”

Controllable sources

Controllable sources are those that can be turned on and off, for instance, nuclear or geothermal plants.

Nuclear power plants

Nuclear power plants are very important for meeting climate change goals. However, they pose some significant challenges, including public safety, waste disposal, slow technological learning, and high costs. ML, specifically deep networks, can be used to reduce maintenance costs. They can speed up inspections by detecting cracks and anomalies from image and video data, or by preemptively detecting faults from high-dimensional sensor and simulation data.

Nuclear fusion reactors

Nuclear fusion reactors are capable of producing safe and carbon-free electricity with the help of a virtually limitless hydrogen fuel supply. But right now they consume more energy than they produce. A lot of scientific and engineering research still needs to be done before nuclear fusion reactors can supply electricity to users. ML can accelerate this research by guiding experimental design and monitoring physical processes. As nuclear fusion reactors have a large number of tunable parameters, ML can help prioritize which parameter configurations should be explored during physical experiments.

Reducing the current electricity system’s climate impacts

Reducing life-cycle fossil fuel emissions

While we work towards bringing low-carbon electricity systems to society, it is important to reduce emissions from current fossil fuel power generation. ML can be used to prevent the leakage of methane from natural gas pipelines and compressor stations. Researchers have previously used sensor and satellite data to proactively suggest pipeline maintenance or detect existing leaks; ML can improve and scale these existing solutions.

Reducing system waste

As electricity is supplied to consumers, some of it is lost as resistive heat on electricity lines. While these losses cannot be eliminated completely, they can be significantly reduced to cut waste and emissions. ML can help prevent avoidable losses through predictive maintenance, by suggesting proactive electricity grid upgrades.

To know more in detail about how machine learning can help reduce the impact of climate change, check out the report.

Deep learning models have massive carbon footprints, can photonic chips help reduce power consumption?
Now there’s a CycleGAN to visualize the effects of climate change. But is this enough to mobilize action?
ICLR 2019 Highlights: Algorithmic fairness, AI for social good, climate change, protein structures, GAN magic, adversarial ML and much more
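To make the demand-forecasting idea from the article above concrete, here is a minimal, purely illustrative sketch of short-term electricity demand forecasting with scikit-learn on synthetic data. The features (hour of day, temperature), the model choice, and all the numbers are assumptions for illustration only and are not taken from the report.

```python
# Illustrative only: short-term electricity demand forecasting on synthetic data.
# Features, model, and numbers are assumptions for this sketch, not from the report.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
hour = rng.integers(0, 24, n)            # hour of day
temp = rng.normal(20, 8, n)              # ambient temperature (hypothetical driver)

# Synthetic demand: a daily cycle plus heating/cooling load plus noise
demand = 50 + 10 * np.sin(2 * np.pi * hour / 24) + 0.5 * np.abs(temp - 18) + rng.normal(0, 2, n)

X = np.column_stack([hour, temp])
X_train, X_test, y_train, y_test = train_test_split(X, demand, random_state=0)

model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
```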


Facebook researchers open-source AI Habitat for embodied AI research and introduce Replica, a dataset of indoor space reconstructions

Amrata Joshi
17 Jun 2019
6 min read
Last week, the team at Facebook AI open-sourced AI Habitat, a new simulation platform for embodied AI research. AI Habitat is designed to train embodied agents, e.g., virtual robots, in photo-realistic 3D environments.

https://twitter.com/DhruvBatraDB/status/1100791464513040384

The blog post reads, “Our goal in sharing AI Habitat is to provide the most universal simulator to date for embodied research, with an open, modular design that’s both powerful and flexible enough to bring reproducibility and standardized benchmarks to this subfield.”

Last week the Facebook AI team also shared Replica, a dataset of reconstructions of various indoor spaces. The 3D reconstructions can be of a staged apartment, a retail store, or other indoor spaces. Currently, AI Habitat can run Replica’s state-of-the-art reconstructions and can also work with existing 3D assets created for embodied research, including the Gibson and Matterport3D datasets.

AI Habitat’s modular software stack is built around the principles of compatibility and flexibility. The blog reads, “We incorporated direct feedback from the research community to develop this degree of flexibility, and also pushed the state of the art in training speeds, making the simulator able to render environments orders of magnitude faster than previous simulators.”

The platform has already been tested and is now available. The Facebook team recently hosted an autonomous navigation challenge that ran on the platform; the winning teams will be awarded Google Cloud credits at the Habitat Embodied Agents workshop at CVPR 2019. AI Habitat is also part of Facebook AI’s ongoing effort to create systems that rely less on the large annotated datasets used for supervised training. The blog reads, “As more researchers adopt the platform, we can collectively develop embodied AI techniques more quickly, as well as realize the larger benefits of replacing yesterday’s training data sets with active environments that better reflect the world we’re preparing machine assistants to operate in.”

The Facebook AI researchers published a paper, Habitat: A Platform for Embodied AI Research, in April this year. The paper highlights the set of design requirements the team sought to fulfill. Have a look at a few of the requirements below:

Performant rendering engine: The team aimed for a resource-efficient rendering engine that produces multiple channels of visual information, including RGB (red, green, blue), depth, semantic instance segmentation, and surface normals, for multiple operating agents.
Scene dataset ingestion API: The platform needed to be agnostic to 3D scene datasets so that users can bring their own datasets, so the team designed a dataset ingestion API.
Agent API: It lets users specify parameterized embodied agents with well-defined geometry, physics, and actuation characteristics.
Sensor suite API: It allows the specification of arbitrary numbers of parameterized sensors, including RGB, depth, contact, GPS, and compass sensors, attached to each agent.

AI Habitat features a stack of three layers

With AI Habitat, the team aims to retain the simulation-related benefits that past projects demonstrated, including speeding up experimentation and RL-based training, and to apply them to a widely compatible and realistic platform.

AI Habitat features a stack of three modular layers, each of which can be configured or even replaced to work with different kinds of agents, evaluation protocols, training techniques, and environments. The simulation engine, known as Habitat-Sim, forms the base of the stack and includes built-in support for existing 3D environment datasets such as Gibson and Matterport3D. Habitat-Sim can also abstract away the details of specific datasets so they can be used across simulations.

Habitat-API is the second layer in AI Habitat’s software stack: a high-level library that defines tasks such as visual navigation and question answering. This API incorporates additional data and configurations, and further simplifies and standardizes the training and evaluation of embodied agents.

The third and final layer of the platform is where users specify training and evaluation parameters, such as how difficulty might ramp across multiple runs and which metrics to focus on.

According to the researchers, the future of AI Habitat and embodied AI research lies in simulated environments that are indistinguishable from real life.

Replica data sets by FRL researchers

For Replica, the FRL (Facebook Reality Labs) researchers created a dataset consisting of scans of 18 scenes that range in size from an office conference room to a two-floor house. The team also annotated the environments with semantic labels, such as “window” and “stairs,” including labels for individual objects, such as a book or a plant. To create the dataset, the FRL researchers used proprietary camera technology as well as a spatial AI technique based on simultaneous localization and mapping (SLAM) approaches.

Replica captures the details in the raw video, reconstructing dense 3D meshes along with high-resolution, high dynamic range textures. The data used to generate Replica has any personal details that could identify an individual, such as family photos, removed. The researchers had to manually fill in the small holes that are inevitably missed during scanning, and they used a 3D paint tool to apply annotations directly onto the meshes. The blog reads, “Running Replica’s assets on the AI Habitat platform reveals how versatile active environments are to the research community, not just for embodied AI but also for running experiments related to CV and ML.”

Habitat Challenge for the embodied platform

The researchers held the Habitat Challenge in April-May this year, a competition focused on evaluating the task of goal-directed visual navigation. The aim was to demonstrate the utility of AI Habitat’s modular approach as well as its emphasis on 3D photo-realism. The challenge required participants to upload code, unlike traditional challenges in which participants upload predictions for a task on a given benchmark. The code was then run in new environments that the agents were not familiar with. The top-performing teams were Team Arnold (a group of researchers from CMU) and Team Mid-Level Vision (a group of researchers from Berkeley and Stanford).

The blog further reads, “Though AI Habitat and Replica are already powerful open resources, these releases are part of a larger commitment to research that’s grounded in physical environments. This is work that we’re pursuing through advanced simulations, as well as with robots that learn almost entirely through unsimulated, physical training. Traditional AI training methods have a head start on embodied techniques that’s measured in years, if not decades.”

To know more about this news, check out Facebook AI’s blog post.

Facebook researchers show random methods without any training can outperform modern sentence embeddings models for sentence classification
Facebook researchers build a persona-based dialog dataset with 5M personas to train end-to-end dialogue systems
Facebook AI researchers investigate how AI agents can develop their own conceptual shared language
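As a schematic aside to the article above, the three-layer structure it describes (a simulator producing sensor observations, a task layer defining episodes and rewards, and a top layer where the user configures training and evaluation) can be mocked in a few lines of plain Python. All class and method names below are invented for illustration; this is not the real Habitat-Sim or Habitat-API interface.

```python
# Schematic mock of the three-layer structure described above.
# The names here are invented; this is NOT the real Habitat-Sim / Habitat-API.
import random

class MockSimulator:
    """Bottom layer: produces sensor observations (stand-in for Habitat-Sim)."""
    def observe(self):
        return {"rgb": [[0] * 4] * 4,
                "depth": [[1.0] * 4] * 4,
                "gps": (random.random(), random.random())}

class PointNavTask:
    """Middle layer: defines an episodic task and its reward (stand-in for Habitat-API)."""
    def __init__(self, simulator, max_steps=10):
        self.sim, self.max_steps, self.steps = simulator, max_steps, 0

    def reset(self):
        self.steps = 0
        return self.sim.observe()

    def step(self, action):
        self.steps += 1
        done = action == "stop" or self.steps >= self.max_steps
        reward = 1.0 if action == "stop" else -0.01  # toy shaping
        return self.sim.observe(), reward, done

# Top layer: user-specified training/evaluation loop and parameters.
task = PointNavTask(MockSimulator(), max_steps=5)
obs = task.reset()
done, total = False, 0.0
while not done:
    action = random.choice(["move_forward", "turn_left", "turn_right", "stop"])
    obs, reward, done = task.step(action)
    total += reward
print("episode return:", round(total, 2))
```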


Amazon is being sued for recording children’s voices through Alexa without consent

Sugandha Lahoti
17 Jun 2019
5 min read
Last week, two lawsuits were filed in Seattle alleging that Amazon records voiceprints of children using its Alexa devices without their consent, in violation of laws governing recordings in at least eight states, including Washington. The complaint was filed on behalf of a 10-year-old Massachusetts girl on Tuesday in federal court in Seattle; a nearly identical suit was filed the same day in California Superior Court in Los Angeles, on behalf of an 8-year-old boy.

What was the complaint?

Per the complaint, “Alexa routinely records and voiceprints millions of children without their consent or the consent of their parents.” The complaint notes that Alexa devices record and transmit any speech captured after a “wake word” activates the device, regardless of the speaker and regardless of whether that person purchased the device or installed the associated app. It alleges that Amazon saves a permanent recording of the user’s voice, rather than deleting the recordings after a short time or not storing them at all.

In both cases, the children had interacted with Echo Dot speakers in their homes, and in both cases the parents said they had never agreed to their child’s voice being recorded. The lawsuit alleges that Amazon’s failure to obtain consent violates the laws of Florida, Illinois, Michigan, Maryland, Massachusetts, New Hampshire, Pennsylvania and Washington, which require the consent of all parties to a recording, regardless of age.

Aside from “the unique privacy interest” involved in recording someone’s voice, the lawsuit says, “It takes no great leap of imagination to be concerned that Amazon is developing voiceprints for millions of children that could allow the company (and potentially governments) to track a child’s use of Alexa-enabled devices in multiple locations and match those uses with a vast level of detail about the child’s life, ranging from private questions they have asked Alexa to the products they have used in their home.”

What does the lawsuit suggest Amazon should do?

The plaintiffs suggest that more could be done to ensure children and others are aware of what is going on. The lawsuit claims that Amazon should inform users who have not previously consented that they are being recorded and ask for their consent, and that it should deactivate permanent recording for users who have not consented. The complaints also suggest that Alexa devices should be designed to send only a digital query, rather than a voice recording, to Amazon’s servers. Alternatively, Amazon could automatically overwrite the recordings shortly after they have been processed.

What is Amazon’s response?

When Vox reporters asked Amazon for comment, the company wrote in an email, “Amazon has a longstanding commitment to preserving the trust of our customers, and we have strict measures and protocols in place to protect their security and privacy.” It also pointed to a company blog post about the FreeTime parental controls on Alexa. Per the FreeTime parental control policy, parents can review and delete their children’s voice recordings at any time via an app or the firm’s website. In addition, it says, they can contact the firm and request the deletion of their child’s voice profile and any personal information associated with it. However, these same requirements do not apply to a child’s use of Alexa outside of the FreeTime service and children’s Alexa skills.

Amazon’s Alexa terms of use note, “if you do not accept the terms of this agreement, then you may not use Alexa.” However, according to Andrew Schapiro, an attorney with Quinn Emanuel Urquhart & Sullivan, one of the two law firms representing the plaintiffs, “There is nothing in that agreement that would suggest that “you” means a marital community, family or household. I doubt you could even design terms of service that bind ‘everyone in your household.’”

This could also mean that Alexa is storing details about everyone, not just children. A comment on Hacker News reads, “Important to note that if this allegation is true, it means Alexa is recording everyone and storing it indefinitely, not just children. The lawsuit just says children because children have more privacy protections than adults so it's easier to win a case when children's rights are being violated.”

Others share similar opinions:

https://twitter.com/_FamilyInsights/status/1140490515240165377
https://twitter.com/lewiskamb/status/1138895472351883265

However, a few don’t agree:

https://twitter.com/shellypalmer/status/1139545654567559169
https://twitter.com/CarolannJacobs/status/1139165270524780554

The suit asks a judge to certify the class action, rule that Amazon violated state laws, require it to delete all recordings of class members, and prevent further recording without prior consent. It seeks damages to be determined at trial; the Seattle case seeks damages of up to $100 a day and the California case seeks damages of $5,000 per violation.

Google Home and Amazon Alexa can no longer invade your privacy; thanks to Project Alias!
US regulators plan to probe Google on anti-trust issues; Facebook, Amazon & Apple also under legal scrutiny.
Amazon shareholders reject proposals to ban sale of facial recognition tech to govt and to conduct an independent review of its human and civil rights impact.


Google, Facebook and Twitter submit reports to EU Commission on progress to fight disinformation

Fatema Patrawala
17 Jun 2019
6 min read
Last Friday, the European Commission published a report detailing the progress made by Facebook, Google and Twitter in March 2019 to fight disinformation. The three online platforms are signatories to the Code of Practice against disinformation and have committed to report monthly on their actions ahead of the European Parliament elections in May 2019.

https://twitter.com/jb_bax/status/1139467517007749121
https://twitter.com/jb_bax/status/1139475796425400320

The monthly reporting cycle builds on the Code of Practice and is part of the Action Plan against disinformation, which the European Union adopted last December to build up capabilities and strengthen cooperation between Member States and EU institutions so as to proactively address the threats posed by disinformation. The reporting signatories committed to the Code of Practice in October 2018 on a voluntary basis. The Code aims to reach the objectives set out by the Commission's Communication presented in April 2018 by setting a wide range of commitments:

Disrupt advertising revenue for accounts and websites misrepresenting information, and provide advertisers with adequate safety tools and information about websites purveying disinformation.
Enable public disclosure of political advertising and make an effort towards disclosing issue-based advertising.
Have a clear and publicly available policy on identity and online bots, and take measures to close fake accounts.
Offer information and tools to help people make informed decisions, and facilitate access to diverse perspectives about topics of public interest, while giving prominence to reliable sources.
Provide privacy-compliant access to data to researchers to track and better understand the spread and impact of disinformation.

The Commission is monitoring the platforms' progress towards meeting the commitments that are most relevant and urgent ahead of the election campaign, namely: scrutiny of ad placements; political and issue-based advertising; and integrity of services.

Vice-President for the Digital Single Market Andrus Ansip, Commissioner for Justice, Consumers and Gender Equality Věra Jourová, Commissioner for the Security Union Julian King, and Commissioner for the Digital Economy and Society Mariya Gabriel made a joint statement welcoming the progress made by the three companies: "We appreciate the efforts made by Facebook, Google and Twitter to increase transparency ahead of the European elections. We welcome that the three platforms have taken further action to fulfil their commitments under the Code. All of them have started labelling political advertisements on their platforms. In particular, Facebook and Twitter have made political advertisement libraries publicly accessible, while Google's library has entered the testing phase. This provides the public with more transparency around political ads. However, further technical improvements as well as sharing of methodology and data sets for fake accounts are necessary to allow third-party experts, fact-checkers and researchers to carry out independent evaluation. At the same time, it is regrettable that Google and Twitter have not yet reported further progress regarding transparency of issue-based advertising, meaning issues that are sources of important debate during elections. We are pleased to see that the collaboration under the Code of Practice has encouraged Facebook, Google and Twitter to take further action to ensure the integrity of their services and fight against malicious bots and fake accounts. In particular, we welcome Google increasing cooperation with fact-checking organisations and networks. Furthermore, all three platforms have been carrying out initiatives to promote media literacy and provide training to journalists and campaign staff. The voluntary actions taken by the platforms are a step forward to support transparent and inclusive elections and better protect our democratic processes from manipulation, but a lot still remains to be done. We look forward to the next reports from April showing further progress ahead of the European elections.”

Google reported on specific actions taken to improve scrutiny of ad placements in the EU, including a breakdown per Member State

Google gave an update on its election ads policy, which it started enforcing on 21 March 2019, and announced the launch of its EU Elections Ads Transparency Report and its searchable ad library, available in April. Google reported that it took action against more than 130,000 EU-based accounts that violated its ads policies to fight misrepresentation, and almost 27,000 that violated policies on original content. The company also provided data on the removal of a significant number of YouTube channels for violation of its policies on spam, deceptive practices and scams, and impersonation. Google did not report on progress regarding the definition of issue-based advertising.

Facebook reported on actions taken against ads that violated its policies for containing low quality, disruptive, misleading or false content

Facebook provided information on its political ads policy, which also applies to Instagram. The company noted the launch of a new, publicly available Ad Library globally on 28 March 2019, covering Facebook and Instagram, and highlighted the expansion of access to its Ad Library application programming interface. It reported taking action on over 1.2 million accounts in the EU for violation of policies on ads and content. Facebook reported 2.2 billion fake accounts disabled globally in Q1 of 2019, and it took down eight coordinated inauthentic behaviour networks originating in North Macedonia, Kosovo and Russia. The report did not state whether these networks also affected users in the EU.

Twitter reported an update to its political campaigning ads policy and provided details on the public disclosure of political ads in Twitter's Ad Transparency Centre

Twitter provided figures on actions undertaken against spam and fake accounts, but did not provide further insights on these actions and how they relate to activity in the EU. Twitter reported rejecting more than 6,000 ads targeted at the EU for violation of its unacceptable business practices ads policy, as well as about 10,000 EU-targeted ads for violations of its quality ads policy. Twitter challenged almost 77 million spam or fake accounts. Twitter did not report on any actions to improve the scrutiny of ad placements or provide any metrics with respect to its commitments in this area.

What are the next steps for the EU Commission

The report covers the measures taken by the online platforms in March 2019, which will allow the Commission to verify that effective policies to ensure the integrity of electoral processes are in place before the European elections in May 2019. The Commission will carry out a comprehensive assessment of the Code's initial 12-month period by the end of 2019. If the results prove to be unsatisfactory, the Commission may propose further actions, which may be of a regulatory nature.

Google and Facebook allegedly pressured and “arm-wrestled” EU expert group to soften European guidelines for fake news: Open Democracy Report
Ireland’s Data Protection Commission initiates an inquiry into Google’s online Ad Exchange services
Google and Binomial come together to open-source Basis Universal Texture Format


Facebook signs on more than a dozen backers for its GlobalCoin cryptocurrency including Visa, Mastercard, PayPal and Uber

Bhagyashree R
14 Jun 2019
4 min read
Facebook has secured the backing of some major companies, including Visa, Mastercard, PayPal, and Uber, for its cryptocurrency project codenamed Libra (also known as GlobalCoin), as per a WSJ report shared yesterday. Each of these companies will invest $10 million as part of a governing consortium for the cryptocurrency, independent of Facebook. According to WSJ, as part of the governing body these companies will be able to monitor Facebook’s payment ambitions. They will also benefit from the popularity of the currency if it takes off with Facebook’s 2.4 billion monthly active users.

Facebook’s GlobalCoin

Despite Facebook being extremely discreet about its cryptocurrency project, many rumors have been floating around about it. The only official statement came from Laura McCracken, Facebook’s Head of Financial Services & Payment Partnerships for Northern Europe, who, in an interview with the German finance magazine Wirtschaftswoche, disclosed that the project’s white paper would be unveiled on June 18th. Other media reports suggest that Facebook is targeting 2020 to launch its cryptocurrency.

GlobalCoin is going to be a “stablecoin”, which means it will have less price volatility compared to other cryptocurrencies such as Ethereum and Bitcoin. To provide price stability, it will be pegged to a basket of international government-issued currencies, including the U.S. dollar, euro, and Japanese yen. Facebook has spoken with various financial institutions about creating a $1 billion basket of multiple international fiat currencies that will serve as collateral to stabilize the price of the coin. “The value of Facebook Coin will be secured with a basket of fiat currencies,” McCracken told the publication.

After its launch, you will be able to use GlobalCoin to make payments via Facebook’s messaging products like Messenger and WhatsApp with zero processing fees. Facebook is in talks with merchants to accept its cryptocurrency as payment and may offer sign-up bonuses. The tech giant is also reportedly looking into developing ATM-like physical terminals for people to convert their money into its cryptocurrency.

What do people think about the GlobalCoin cryptocurrency?

Despite benefits like decentralized governance, lower volatility, and no interchange fees, many users are skeptical about this cryptocurrency, given Facebook’s reputation. Here’s what one user said in a Reddit thread, “Facebook tried and failed with credits a while back. A Facebook coin would have some use cases (sending money across borders for example), but in a day-to-day sense, having to buy credits to use isn't addressing a problem for many people. There's also going to be some people who have no desire to share what they're purchasing/doing with Facebook, especially if there isn't any significant benefit in doing so.”

Here’s what some Twitter users think about Facebook’s GlobalCoin:

https://twitter.com/SarahJamieLewis/status/1139429913922957312
https://twitter.com/Timccopeland/status/1137311565273862144

However, some users are supportive of this move. “It is a good idea though because Facebook is now going to start competing with Amazon in e-commerce. Companies aren't going to just buy ads on Facebook, now they're going to directly list their items for sale on the site and consumers can be able to buy those items without ever leaving Facebook. Genius idea. They might give Amazon a run for their money,” another user commented on Reddit.

Mark Zuckerberg is a liar, fraudster, unfit to be the C.E.O. of Facebook, alleges Aaron Greenspan to the UK Parliamentary Committee
Austrian Supreme Court rejects Facebook’s bid to stop a GDPR-violation lawsuit against it by privacy activist, Max Schrems
Zuckerberg just became the target of the world’s first high profile white hat deepfake op. Can Facebook come out unscathed?
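For a rough sense of what being “pegged to a basket” of currencies means, here is a toy calculation. The weights and exchange rates below are invented purely for illustration and have nothing to do with Facebook’s actual, undisclosed design.

```python
# Toy illustration of a basket-pegged reference value.
# Weights and USD exchange rates below are invented for this example only.
basket_weights = {"USD": 0.50, "EUR": 0.30, "JPY": 0.20}   # hypothetical weights
usd_per_unit   = {"USD": 1.00, "EUR": 1.12, "JPY": 0.0093} # hypothetical FX rates

# Reference value of one coin in USD: weighted sum over the basket currencies.
coin_value_usd = sum(w * usd_per_unit[c] for c, w in basket_weights.items())
print(f"1 coin is worth about {coin_value_usd:.4f} USD under these assumed weights and rates")
```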


Introducing Voila, which turns your Jupyter notebooks into standalone web applications

Bhagyashree R
13 Jun 2019
3 min read
Last week, a Jupyter Community Workshop on dashboarding was held in Paris. At the workshop, several contributors came together to build the Voila package, the details of which QuantStack shared yesterday. Voila serves live Jupyter notebooks as standalone web applications, providing a neat way to share your results with colleagues.

Why do we need Voila?

Jupyter notebooks allow you to do literate programming, in which human-friendly explanations are accompanied by code blocks. This lets scientists, researchers, and other practitioners of scientific computing present the theory behind their code, including mathematical equations. However, Jupyter notebooks can be problematic when you want to communicate your results to non-technical stakeholders, who may be put off by the code blocks and by the need to run the notebook to see the results. Notebooks also have no mechanism to prevent arbitrary code execution by the end user.

How does Voila work?

Voila addresses these issues by converting your Jupyter notebook into a standalone web application. After connecting to a notebook URL, Voila launches the kernel for that notebook and runs all the cells; once the execution is complete, it does not shut down the kernel. The notebook is converted to HTML and served to the user, and the rendered HTML includes JavaScript responsible for initiating a websocket connection with the Jupyter kernel. A diagram depicting how this works is included in the original post (source: Jupyter Blog).

Voila provides the following features:

Renders Jupyter interactive widgets: It supports Jupyter widget libraries including bqplot, ipyleaflet, ipyvolume, ipympl, ipysheet, plotly, and ipywebrtc.
Prevents arbitrary code execution: It does not allow arbitrary code execution by consumers of dashboards.
A language-agnostic dashboarding system: Voila is built upon Jupyter standard protocols and file formats, enabling it to work with any Jupyter kernel (C++, Python, Julia).
Includes a custom template system for better extensibility: It provides a flexible template system to produce rich application layouts.

Many Twitter users applauded this new way of creating live and interactive dashboards from Jupyter notebooks:

https://twitter.com/philsheard/status/1138745404772818944
https://twitter.com/andfanilo/status/1138835776828071936
https://twitter.com/ToluwaniJohnson/status/1138866411261124608

Some users also compared it with another dashboarding solution called Panel. The main difference between Panel and Voila is that Panel supports Bokeh widgets, whereas Voila is framework and language agnostic. “Panel can use a Bokeh server but does not require it; it is equally happy communicating over Bokeh Server's or Jupyter's communication channels. Panel doesn't currently support using ipywidgets, nor does Voila currently support Bokeh plots or widgets, but the maintainers of both Panel and Voila have recently worked out mechanisms for using Panel or Bokeh objects in ipywidgets or using ipywidgets in Panels, which should be ready soon,” a Hacker News user commented.

To read more in detail about Voila, check out the official announcement on the Jupyter Blog.

JupyterHub 1.0 releases with named servers, support for TLS encryption and more
Introducing Jupytext: Jupyter notebooks as Markdown documents, Julia, Python or R scripts
JupyterLab v0.32.0 releases
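As a purely illustrative aside to the Voila article above: the kind of notebook Voila turns into a dashboard is an ordinary notebook containing interactive widgets. A minimal cell using ipywidgets (one of the widget libraries listed above) might look like the sketch below; saved in a notebook, it could then be served with a command along the lines of `voila notebook.ipynb` (the notebook filename here is a placeholder).

```python
# A minimal ipywidgets cell of the sort Voila can serve as a standalone app.
# The widget range and the function below are arbitrary examples.
import ipywidgets as widgets
from IPython.display import display

def square(x):
    print(f"{x} squared is {x * x}")

slider = widgets.IntSlider(value=3, min=0, max=10, description="x")
out = widgets.interactive_output(square, {"x": slider})
display(slider, out)
```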

‘Have I Been Pwned’ up for acquisition; Troy Hunt code names this campaign ‘Project Svalbard’

Savia Lobo
12 Jun 2019
4 min read
Yesterday, Troy Hunt revealed on his blog that his ‘Have I Been Pwned’ (HIBP) website is up for sale. Hunt has codenamed the acquisition Project Svalbard and is working with KPMG to find a buyer.

Troy Hunt named Project Svalbard after the Svalbard Global Seed Vault, a secure seed bank on the Norwegian island of Spitsbergen. The vault represents the world’s largest collection of crop diversity, a long-term seed storage facility for worst-case scenarios such as natural or man-made disasters.

Commercial subscribers depend heavily on HIBP to alert members of identity theft programs, enable infosec companies to provide services to their customers, protect large online assets from credential stuffing attacks, prevent fraudulent financial transactions, and much more. Governments around the world use HIBP to protect their departments, and law enforcement agencies use it in their investigations.

Hunt further says he has been handling everything alone: “to date, every line of code, every configuration and every breached record has been handled by me alone. There is no ‘HIBP team’, there’s one guy keeping the whole thing afloat,” he writes.

In January this year, he disclosed the Collection #1 data breach, which included 87 GB of data in a folder containing 12,000-plus files, nearly 773 million email addresses, and more than 21 million unique passwords from data breaches going back to 2008. Hunt uploaded all of this breached data to HIBP, and he says the site has since seen a massive influx of activity, taking him away from other responsibilities. “The extra attention HIBP started getting in Jan never returned to 2018 levels, it just kept growing and growing,” he says.

Hunt said he was concerned about burnout, given the increasing scale and incidence of data breaches. Following this, he said it was time for HIBP to “grow up”. He also believes HIBP could do more in the space, including widening its capture of breaches.

https://twitter.com/troyhunt/status/1138322112224083968

“There's a whole heap of organizations out there that don't know they've been breached simply because I haven't had the bandwidth to deal with it all,” Hunt said. “There's a heap of things I want to do with HIBP which I simply couldn't do on my own. This is a project with enormous potential beyond what it's already achieved and I want to be the guy driving that forward,” Hunt wrote.

Hunt also includes a list of “commitments for the future of HIBP” in his blog post. He said he intends to be “part of the acquisition - that is some company gets me along with the project” and that “freely available consumer searches should remain freely available”. Via Project Svalbard, Hunt hopes to enable HIBP to reach more people and play “a much bigger role in changing the behavior of how people manage their online accounts.”

A couple of commenters on the blog post ask Hunt whether he has considered or approached Mozilla as a potential owner. In a reply to one, he writes, “Being a party that’s already dependent on HIBP, I reached out to them in advance of this blog post and have spoken with them. I can’t go into more detail than that just now, but certainly their use of the service is enormously important to me.”

To know more about this announcement in detail, read Troy Hunt’s official blog post.

A security researcher reveals his discovery on 800+ Million leaked Emails available online
The Collections #2-5 leak of 2.2 billion email addresses might have your information, German news site, Heise reports
Bo Weaver on Cloud security, skills gap, and software development in 2019
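As an illustrative aside to the HIBP article above, the kind of programmatic lookup that commercial subscribers build on can be sketched with HIBP’s public breachedaccount endpoint. This assumes the current v3 API, which postdates this article and requires a paid API key; the key and account below are placeholders.

```python
# Illustrative only: querying the HIBP breachedaccount endpoint (API v3).
# v3 requires an API key and a user agent; the key and account are placeholders.
import requests

API_KEY = "your-hibp-api-key"          # placeholder
ACCOUNT = "someone@example.com"        # placeholder

resp = requests.get(
    f"https://haveibeenpwned.com/api/v3/breachedaccount/{ACCOUNT}",
    headers={"hibp-api-key": API_KEY, "user-agent": "hibp-example-script"},
    params={"truncateResponse": "true"},  # return breach names only
    timeout=10,
)

if resp.status_code == 200:
    print("Breaches found:", [b["Name"] for b in resp.json()])
elif resp.status_code == 404:
    print("No breaches found for this account.")
else:
    print("Unexpected response:", resp.status_code)
```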


Zuckerberg just became the target of the world's first high profile white hat deepfake op. Can Facebook come out unscathed?

Vincy Davis
12 Jun 2019
6 min read
Yesterday, Motherboard reported that a fake video of Mark Zuckerberg was posted on Instagram under the username bill_posters_uk. In the video, Zuckerberg appears to give a threatening speech about the power of Facebook.

https://twitter.com/motherboard/status/1138536366969688064

Motherboard mentions that the video was created by artists Bill Posters and Daniel Howe in partnership with the advertising company Canny. Canny, in partnership with Posters, has previously created several such deepfake videos of Donald Trump, Kim Kardashian, and others. Omer Ben-Ami, one of the founders of Canny, says the video was made to educate the public on the uses of AI and make people realize its potential. But according to other news sources, the artists created the video to test Facebook’s policy of not taking down fake videos and misinformation for the sake of retaining their “educational value”.

Facebook recently received strong criticism for promoting fake videos on its platform. In May, the company refused to remove a doctored video of senior politician Nancy Pelosi. Neil Potts, Public Policy Director at Facebook, had stated that if someone posted a doctored video of Zuckerberg, like the one of Pelosi, it would stay up. Around the same time, Monika Bickert, Vice President for Product Policy and Counterterrorism at Facebook, said of the fake Nancy Pelosi video, “Anybody who is seeing this video in News Feed, anyone who is going to share it to somebody else, anybody who has shared it in the past, they are being alerted that this video is false”. Bickert added, “And this is part of the way that we deal with misinformation.”

Following all of this, Facebook’s stance seems to have been put to the test with the fake Zuckerberg video, and it passed: an Instagram spokesperson said the video will stay up on the platform but will be removed from recommendation surfaces. “We will treat this content the same way we treat all misinformation on Instagram,” a spokesperson for Instagram told Motherboard. “If third-party fact-checkers mark it as false, we will filter it from Instagram’s recommendation surfaces like Explore and hashtag pages.”

The fake Mark Zuckerberg video is a short one in which he appears to talk about Facebook’s power: “Imagine this for a second: One man, with total control of billions of people's stolen data, all their secrets, their lives, their futures, I owe it all to Spectre. Spectre showed me that whoever controls the data, controls the future.” The video is also framed with broadcast chyrons reading, “Zuckerberg: We're increasing transparency on ads. Announces new measures to protect elections,” to make it appear like a regular news report.

As the video is fake and unauthentic, we have not added a link to it in this article.

The audio in the video sounds much like a voiceover, but it is loud and clear, with no sync issues, and the visuals are almost accurate. In this deepfake video, the person shown blinks, moves seamlessly, and gestures the way Zuckerberg would. Motherboard reports that the visuals are taken from a real video of Zuckerberg from September 2017, when he was addressing Russian election interference on Facebook. The Instagram post containing the video stated that it was created using CannyAI's video dialogue replacement (VDR) technology.

In a statement to Motherboard, Omer Ben-Ami said that for the Mark Zuckerberg deepfake, “Canny engineers arbitrarily clipped a 21-second segment out of the original seven minute video, trained the algorithm on this clip as well as videos of the voice actor speaking, and then reconstructed the frames in Zuckerberg's video to match the facial movements of the voice actor.”

Omer also mentions that “the potential of AI lies in the ability of creating a photorealistic model of a human being. It is the next step in our digital evolution where eventually each one of us could have a digital copy, a Universal Everlasting human. This will change the way we share and tell stories, remember our loved ones and create content”.

A CNN reporter has tweeted that CBS is asking Facebook to remove the fake Zuckerberg video because it shows the CBS logo: “CBS has requested that Facebook take down this fake, unauthorized use of the CBSN trademark”.

Apparently the fake video of Zuckerberg has garnered some good laughs in the community. It is also seen as the next wave in the battle against misinformation on social media. A user on Hacker News says, “I love the concept of this. There's no better way to put Facebook's policy to the test than to turn it against them.”

https://twitter.com/jason_koebler/status/1138515287853228032
https://twitter.com/ezass/status/1138592610363174913

But many users are also concerned that if a fake video can look this accurate now, it is going to be a challenge to identify which information is true and which is false. A user on Reddit comments, “This election cycle will be a dry run for the future. Small ads, little bits of phrases and speeches will stream across social media. If it takes hold, I fear for the future. We will find it very, very difficult to know what is real without a large social change, as large as the advent of social media in the first place.”

Another user adds, “I'm routinely surprised by the number of people unaware just how far this technology has progressed just in the past three years, as well as how many people are completely unaware it exists at all. At this point, I think that's scarier than the tech itself.” And another comments, “True. Also the older generation. I can already see my grandpa seeing a deepfake on Fox news and immediately considering it gospel without looking into it further.”

US regulators plan to probe Google on anti-trust issues; Facebook, Amazon & Apple also under legal scrutiny
Facebook argues it didn’t violate users’ privacy rights and thinks there’s no expectation of privacy because there is no privacy on social media
Google and Facebook allegedly pressured and “arm-wrestled” EU expert group to soften European guidelines for fake news: Open Democracy Report


Amazon announces general availability of Amazon Personalize, an AI-based recommendation service

Vincy Davis
12 Jun 2019
3 min read
Two days ago, Amazon Web Services (AWS) announced in a press release that Amazon Personalize is now generally available to all customers. This machine learning technology, based on the same technology used at Amazon.com, is now available for AWS customers to use in their own applications.

https://twitter.com/jeffbarr/status/1138430113589022721

Amazon Personalize helps developers easily add custom machine learning models, such as personalized product and content recommendations, tailored search results, and targeted marketing promotions, to their applications, even if they have no machine learning experience. It is a fully managed service that trains, tunes, and deploys custom, private machine learning models. Customers pay only for what they use, with no minimum fees or upfront commitments.

Amazon Personalize processes and examines a customer’s data, identifies what is meaningful, selects from multiple advanced algorithms built for Amazon’s retail business, and trains and optimizes a personalization model customized to that data. All of this is done while keeping the customer’s data completely private, and results are delivered via an Application Programming Interface (API). Now, with the general availability of Amazon Personalize, application developers and data scientists at businesses of all sizes and across all industries will be able to put Amazon’s expertise in machine learning to work.

Swami Sivasubramanian, Vice President of Machine Learning, Amazon Web Services, said, “Customers have been asking for Amazon Personalize, and we are eager to see how they implement these services to delight their own end users. And the best part is that these artificial intelligence services, like Amazon Personalize, do not require any machine learning experience to immediately train, tune, and deploy models to meet their business demands”.

Amazon Personalize is available in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Singapore) and EU (Ireland). Amazon charges five cents per GB of data uploaded to Personalize and 24 cents per training hour used to train a custom model. Real-time recommendation requests are priced based on the number of requests made, with discounts for larger volumes.

Customers who have already added Amazon Personalize to their apps include Yamaha Corporation of America, Subway, Zola and Segment. In the press release, Ishwar Bharbhari, Director of Information Technology, Yamaha Corporation of America, said, “Amazon Personalize saves us up to 60% of the time needed to set up and tune the infrastructure and algorithms for our machine learning models when compared to building and configuring the environment on our own. It is ideal for both small developer teams who are trying to build the case for ML and large teams who are trying to iterate rapidly at reasonable cost. Even better, we expect Amazon Personalize to be more accurate than other recommender systems, allowing us to delight our customers with highly personalized product suggestions during their shopping experience, which we believe will increase our average order value and the total number of orders”.

Developers are, of course, excited that they can finally use Amazon Personalize in their applications.

https://twitter.com/TheNickWalsh/status/1138243004127334400
https://twitter.com/SubkrishnaRao/status/1138742140996112384
https://twitter.com/PatrickMoorhead/status/1138228634924212229

To get started with Amazon Personalize, head over to this blog post by Julien Simon.

Amazon re:MARS Day 1 kicks off showcasing Amazon’s next-gen AI robots; Spot, the robo-dog and a guest appearance from ‘Iron Man’
US regulators plan to probe Google on anti-trust issues; Facebook, Amazon & Apple also under legal scrutiny
World’s first touch-transmitting telerobotic hand debuts at Amazon re:MARS tech showcase
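As an illustrative aside to the Personalize article above: once a campaign has been trained and deployed, fetching recommendations through the AWS SDK for Python (boto3) looks roughly like the sketch below. The region, campaign ARN, and user ID are placeholders, not real resources.

```python
# Illustrative sketch: fetching recommendations from a deployed Amazon Personalize campaign.
# The region, campaign ARN, and user ID below are placeholders.
import boto3

personalize_runtime = boto3.client("personalize-runtime", region_name="us-east-1")

response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/example-campaign",
    userId="user-42",
    numResults=5,
)

# Print the recommended item IDs returned by the campaign.
for item in response["itemList"]:
    print(item["itemId"])
```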


YouTube CEO Susan Wojcicki says reviewing content before upload isn’t a good idea

Fatema Patrawala
12 Jun 2019
8 min read
This year’s Recode annual Code Conference commenced on Monday, 10th June at Arizona. YouTube CEO, Susan Wojcicki was interviewed yesterday by Peter Kafka, Recode’s senior correspondent and there were some interesting conversations between the two. https://twitter.com/pkafka/status/1137815486387802112 The event is held for two days covering interviews from the biggest names in the tech business. The talks will include the sessions from the below representatives: YouTube CEO Susan Wojcicki Facebook executives Adam Mosseri and Andrew Bosworth Amazon Web Services CEO Andy Jassy Fair Fight founder Stacey Abrams Netflix vice president of original content Cindy Holland Russian Doll star Natasha Lyonne Medium CEO Ev Williams Harley Davidson President and CEO, Matthew Levatich The Uninhabitable Earth author David Wallace Wells This year as Code brings the most powerful people in tech, it also defines the manifesto for the event - as Reckoning. Reckoning means “the avenging or punishing of past mistakes or misdeeds.” As it believes that this year there is no word better which captures the state of mind of those in Big tech today. With some of the incalculable mistakes which these companies have gotten themselves into to build up the internet. This is the focus of this years Code Con 2019. In one of the biggest takeaways from Susan Wojcicki’s interview, she said she is okay with taking content down, but she doesn’t think it’s a good idea to review it before it goes up on the massive video-sharing platform. “I think we would lose a lot of voices,” Wojcicki said. “I don’t think that’s the right answer.” Peter asked her about the updated hate speech policy which YouTube announced last week. Susan in response said it was a very important decision for YouTube and they had been working for it since months. The company updated its hate speech policy according to which it will take down “videos alleging that a group is superior in order to justify discrimination, segregation or exclusion based on qualities like age, gender, race, caste, religion, sexual orientation or veteran status.” The policy directly mentioned removing videos that promote neo-Nazi content or videos that deny commonly accepted violent events, like the Holocaust or the Sandy Hook school shooting. The conversation also went in the direction of last week’s incidence, when YouTube decided that Steven Crowder wasn’t violating its rules when he kept posting videos with homophobic slurs directed at Vox journalist Carlos Maza, and the company eventually demonetized Crowder’s channel. Kafka pointed out that the company is making such decisions, but only after content is online on YouTube’s platform. In response, Wojcicki emphasized the importance of reviewing content after it publishes on the site. “We see all these benefits of openness, but we also see that that needs to be married with responsibility,” she said. She also added that the decision that Crowder’s videos did not violate the policy was hurtful to the LGBTQ community. That was not our intention and we are really sorry about it. However, she did not commit any further action in this case. https://twitter.com/voxdotcom/status/1138236928682336256 Wojcicki admitted that there will likely always be content on YouTube that violates its policies. “At the scale that we’re at, there are always gonna be people who want to write stories,” she said, suggesting that journalists will always choose to focus on the negative aspects of YouTube in their reporting. 
“We have lots of content that’s uploaded and lots of users and lots of really good content. When we look at it, what all the news and the concerns and stories have been about is this fractional 1 percent,” Wojcicki said. “If you talk about what the other 99 points whatever that number is that’s all really valuable content.” “Yes, while there may be something that slips through or some issue, we’re really working hard to address this,” she said. https://twitter.com/Recode/status/1138229101834133510 Wojcicki suggested that instead of approving videos ahead of time, using tiers in which creators get certain privileges over time. This means more distribution and monetization of their content. “I think this idea of like not everything is automatically given to you on day one, that it’s more of a — we have trusted tiers,” she said. Wojcicki then discussed about YouTube limiting recommendations, comments, and sharing, and it has reduced views of white supremacist videos by 80 percent since 2017. She mentioned that it has only now banned that content altogether. And YouTube is one of several prominent tech companies trying to figure out how to deal with hateful content proliferating on their platforms. Later Wojcicki shifted the focus of conversation on the improvements YouTube has made in the past few years. “Two years ago there were a lot of articles, a lot of concerns about how we handle violent extremism. If you talk to people who are experts in this field, you can see that we’ve made tremendous progress.” “We have a lot of tools, we work hard to understand what is happening on it and really work hard to enforce the work that we’re doing. I think if you look across the work you can see we’ve made tremendous progress in a number of these areas,” Wojcicki said. “If you were to fast-forward a couple years and say, well, what that would look like in 12 months and then in another 12 months, what are all the different tools that have been built, I think you’ll see there will be a lot of progress.” There were questions from the audience as well and one of them asked Wojcicki, “You started off with an apology to LGBT community but then you also said that you were involved in it and you think YouTube made the right call as to why people don't feel like that's an apology and concerned that YouTube flags LGBT and positive content just for being LGBT and sometimes being sensitive and yet slurs are allowed so I am curious to know if you really sorry for anything that happened to the LGBTQ community? Or are you just sorry they were offended?" Susan said she is personally really sorry for what happened and speaking up for the company they had not intended to do that and the company is also sorry for the hurt that had caused. Susan then continued to give a vague response and did not really answer the question of why LGBT positive content get flagged as sensitive but slurs against the LGBT community is allowed on Youtube. She reiterated the policy and tried to play around by saying we have people who belong to that community and we support them, and that it was a “hard” decision. “So I am really personally really sorry and that was not our intent. 
YouTube has always been the home of so many LGBTQ creators, and that is why it was so emotional and why, I think, this decision, even though it was hard, was made even harder for us. YouTube has so many people from the LGBTQ community, and we have always wanted to support the community openly, in spite of this hard issue that we have had right now. People had this criticism: why did you change your logo to rainbows even though you made this hard decision? Because as a company we really wanna support the community. It's just that from a policy standpoint we need to be consistent, because if we took down that content, there would be so much other content that we would need to take down. We don't want to just be knee-jerk; we need to think about it in a very thoughtful way. We will speak to people from the LGBTQ community and make sure that we are incorporating that going forward in terms of how we think about harassment, and make sure that we are implementing that in a fair and consistent way going forward. And I think it was a hard week, and I am truly sorry for the hurt that we have caused to the community. It was not our intention at all, and I do want to say that many changes we made to the hate policy are really going to be beneficial to the community. There are a lot of videos there and a lot of ways the community is attacked, and we will be taking down those videos going forward. We will be very consistent, and if we see such videos we will take them down."

https://twitter.com/Recode/status/1138231255445598208

The community is also unhappy with her response, saying it does not answer the question and that she has tried to deny responsibility.

https://twitter.com/VickerySec/status/1138657458182770688

https://twitter.com/Aleen/status/1138246902539886592

Facebook argues it didn't violate users' privacy rights and thinks there's no expectation of privacy because there is no privacy on social media

Time for data privacy: DuckDuckGo CEO Gabe Weinberg in an interview with Kara Swisher

Privacy Experts discuss GDPR, its impact, and its future on Beth Kindig's Tech Lightning Rounds Podcast

Google researchers present Zanzibar, a global authorization system, it scales trillions of access control lists and millions of authorization requests per second

Amrata Joshi
11 Jun 2019
6 min read
Google researchers have presented a paper on Google's consistent, global authorization system, known as Zanzibar. The paper covers the design, implementation, and deployment of Zanzibar for storing and evaluating access control lists (ACLs). Zanzibar offers a uniform data model and configuration language for expressing a wide range of access control policies from hundreds of client services at Google, including Cloud, Drive, Calendar, Maps, YouTube, and Photos.

Zanzibar's authorization decisions respect the causal ordering of user actions and thus provide external consistency amid changes to access control lists and object contents. It scales to trillions of access control lists and millions of authorization requests per second to support services used by billions of people, and it has maintained 95th-percentile latency of less than 10 milliseconds and availability greater than 99.999% over three years of production use.

The authors who contributed to the paper are Ruoming Pang, Ramón Cáceres, Mike Burrows, Zhifeng Chen, Pratik Dave, Nathan Germer, Alexander Golynski, Kevin Graney, Nina Kang, Lea Kissner, Jeffrey L. Korn, Abhishek Parmar, Christopher D. Richards, and Mengzhi Wang.

What are the goals of the Zanzibar system

The researchers set the following goals for Zanzibar:

Correctness: The system must ensure consistency of access control decisions.
Flexibility: The system should support access control policies for both consumer and enterprise applications.
Low latency: The system should respond quickly, because authorization checks are usually in the critical path of user interactions, and low latency is especially important for serving search results, which often require tens to hundreds of checks.
High availability: The system should reliably respond to requests, because in the absence of an explicit authorization decision, client services would be forced to deny their users access.
Large scale: The system should protect billions of objects shared by billions of users, and it should be deployed around the globe to be close to its clients and their end users.

To achieve these goals, Zanzibar combines several features. For flexibility, it pairs a simple data model with a powerful configuration language that allows clients to define arbitrary relations between users and objects. It employs an array of techniques to achieve low latency and high availability, and for consistency it stores the data in normalized form.

Zanzibar replicates ACL data across multiple data centers

Zanzibar operates at a global scale, storing more than two trillion ACLs and performing millions of authorization checks per second. The ACL data does not lend itself to geographic partitioning, because authorization checks for an object can come from anywhere in the world. This is why Zanzibar replicates all of its ACL data in multiple geographically distributed data centers and distributes the load across thousands of servers around the world.

Zanzibar's architecture includes main servers organized in clusters

Image source: Zanzibar: Google's Consistent, Global Authorization System

The acl servers are the main server type in the system; they are organized in clusters and respond to Check, Read, Expand, and Write requests.
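To make the data model and the Check operation concrete, here is a minimal, hypothetical Python sketch of relation tuples (roughly the paper's object#relation@user form) backed by a naive in-memory store. The class and method names are invented for illustration; the real system evaluates namespace rewrite rules, tracks snapshot timestamps, and runs on Spanner, none of which is modelled here.

```python
from collections import namedtuple

# A relation tuple roughly follows the paper's object#relation@user form.
RelationTuple = namedtuple("RelationTuple", ["namespace", "object_id", "relation", "user"])

class NaiveAclStore:
    """Toy in-memory stand-in for Zanzibar's Spanner-backed tuple storage."""

    def __init__(self):
        self._tuples = set()

    def write(self, namespace, object_id, relation, user):
        # In Zanzibar, writes go to the per-namespace Spanner database and the changelog.
        self._tuples.add(RelationTuple(namespace, object_id, relation, user))

    def check(self, namespace, object_id, relation, user):
        # Real Zanzibar also applies namespace-config rewrite rules
        # (e.g. "editor implies viewer"); this sketch only does a direct lookup.
        return RelationTuple(namespace, object_id, relation, user) in self._tuples

# Usage: grant a user the "viewer" relation on a document, then check it.
store = NaiveAclStore()
store.write("doc", "readme", "viewer", "user:alice")
print(store.check("doc", "readme", "viewer", "user:alice"))  # True
print(store.check("doc", "readme", "viewer", "user:bob"))    # False
```

The sketch only shows the shape of the API surface; everything the paper is actually about, such as distributed evaluation and external consistency, happens behind a Check call like the one above.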
When a request arrives at any server in a cluster, that server fans the work out to other servers in the cluster, and those servers may in turn contact further servers to compute intermediate results. The initial server gathers the final result and returns it to the client.

Zanzibar stores the ACLs and their metadata in Spanner databases: one database storing relation tuples for each client namespace, one database holding all namespace configurations, and one changelog database shared across all namespaces. The acl servers read and write these databases while responding to client requests.

There is also a specialized server type, the watchservers, that responds to Watch requests. These servers tail the changelog and serve namespace changes to clients in real time.

Zanzibar additionally runs a data processing pipeline that performs a variety of offline functions across all Zanzibar data in Spanner, for example producing dumps of the relation tuples in each namespace at a known snapshot time. For optimizing operations on large and deeply nested sets, Zanzibar uses an indexing system known as Leopard. Leopard reads periodic snapshots of ACL data, watches for changes between snapshots, performs transformations such as denormalization, and responds to requests from the acl servers.

The researchers conclude that Zanzibar offers a simple, flexible data model with configuration language support. According to them, Zanzibar's external consistency model allows authorization checks to be evaluated at distributed locations without the need for global synchronization, while also offering low latency, scalability, and high availability.

People are finding the paper very interesting, and some of the numbers in it have surprised them. A user commented on Hacker News, "Excellent paper. As someone who has worked with filesystems and ACLs, but never touched Spanner before." Another user commented, "What's interesting to me here is not the ACL thing, it's how in a way 'straight forward' this all seems to be." Another comment reads, "I'm surprised by all the numbers they give out: latency, regions, operation counts, even servers. The typical Google paper omits numbers on the Y axis of its most interesting graphs. Or it says 'more than a billion', which makes people think '2B', when the actual number might be closer to 10B or even higher."

https://twitter.com/kissgyorgy/status/1137370866453536769

https://twitter.com/markcartertm/status/1137644862277210113

A few others note that the project wasn't initially called Zanzibar; it was called 'Spice'.

https://twitter.com/LeaKissner/status/1136691523104280576

To know more about this system, check out the paper Zanzibar: Google's Consistent, Global Authorization System.

Google researchers propose building service robots with reinforcement learning to help people with mobility impairment

Researchers propose a reinforcement learning method that can hack Google reCAPTCHA v3

Researchers input rabbit-duck illusion to Google Cloud Vision API and conclude it shows orientation-bias


TensorFlow 2.0 beta releases with distribution strategy, API freeze, easy model building with Keras and more

Vincy Davis
10 Jun 2019
5 min read
After all the hype and waiting, Google has finally announced the beta version of TensorFlow 2.0. The headline feature is tf.distribute.Strategy, which distributes training across multiple GPUs, multiple machines, or TPUs with minimal code changes. The TensorFlow 2.0 beta also brings a number of major improvements, breaking changes, and bug fixes.

Earlier this year, the TensorFlow team had told users what to expect from TensorFlow 2.0. The 2.0 API is now final, with the symbol renaming/deprecation changes completed, and it is available as part of the TensorFlow 1.14 release in the compat.v2 module.

TensorFlow 2.0 support for Keras features

Distribution Strategy for hardware

The tf.distribute.Strategy API supports multiple user segments, including researchers and ML engineers, and provides good performance and easy switching between strategies. Users can distribute training across multiple GPUs, multiple machines, or TPUs, and can distribute their existing models and training code with minimal code changes. tf.distribute.Strategy can be used with:

TensorFlow's high-level APIs
tf.keras
tf.estimator
Custom training loops

TensorFlow 2.0 beta also simplifies the API for custom training loops, which is likewise built on tf.distribute.Strategy. Custom training loops give flexibility and greater control over training, and they make it easier to debug both the model and the training loop.

Model Subclassing

Building a fully customizable model by subclassing tf.keras.Model allows users to define their own forward pass. Layers are created in the __init__ method and set as attributes of the class instance, while the forward pass is defined in the call method. Model subclassing is particularly useful when eager execution is enabled, because it allows the forward pass to be written imperatively, and it gives greater flexibility when creating models that are not otherwise easily expressible.

Breaking Changes

tf.contrib has been deprecated, and its functionality has been migrated to the core TensorFlow API, moved to tensorflow/addons, or removed entirely. In the tf.estimator.DNN/Linear/DNNLinearCombined family, the premade estimators have been updated to use tf.keras.optimizers instead of tf.compat.v1.train.Optimizers. A checkpoint converter tool for converting optimizers is also included with this release.

Bug Fixes and Other Changes

This beta version of 2.0 includes many bug fixes and other changes, some of which are listed below:

In tf.data.Options, the experimental_numa_aware option has been removed and support for TensorArrays has been added.
tf.keras.estimator.model_to_estimator now supports exporting to the tf.train.Checkpoint format, which makes saved checkpoints compatible with model.load_weights.
tf.contrib.estimator.add_metrics has been replaced with tf.estimator.add_metrics.
A gradient for the SparseToDense op, a GPU implementation of tf.linalg.tridiagonal_solve, and broadcasting support for tf.matmul have been added.
The beta also exposes a flag that allows the number of threads to vary across Python benchmarks.
The unused StringViewVariantWrapper and tf.string_split have been removed from the v2 API.

The TensorFlow team has also set up a TF 2.0 Testing User Group where users can report any snags they hit and share feedback.
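To make the two headline features concrete, here is a minimal sketch that pairs a subclassed tf.keras.Model with tf.distribute.MirroredStrategy. It assumes a TensorFlow 2.0 beta installation and uses random dummy data in place of a real dataset; treat it as an illustrative example rather than official sample code.

```python
import tensorflow as tf

# Model subclassing: layers are created in __init__ and the forward
# pass is defined imperatively in call().
class TinyClassifier(tf.keras.Model):
    def __init__(self):
        super(TinyClassifier, self).__init__()
        self.hidden = tf.keras.layers.Dense(64, activation="relu")
        self.out = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        return self.out(self.hidden(inputs))

# Distribution strategy: building and compiling the model inside the
# strategy scope mirrors training across all available GPUs with
# minimal changes to the training code itself.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = TinyClassifier()
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Dummy data stands in for a real dataset.
x = tf.random.normal((256, 32))
y = tf.random.uniform((256,), maxval=10, dtype=tf.int64)
model.fit(x, y, epochs=1, batch_size=32)
```

Switching to multi-machine or TPU training is meant to require only swapping MirroredStrategy for another strategy class, which is the point of the new API.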
General reaction to the release of the TensorFlow 2.0 beta is positive.

https://twitter.com/markcartertm/status/1137238238748266496

https://twitter.com/tonypeng_Synced/status/1137128559414087680

A user on Reddit comments, "Can't wait to try that out!"

However, some users have compared it to PyTorch, calling PyTorch the more comprehensive platform: better suited for research and good for production. A user on Hacker News comments, "Maybe I'll give TF another try, but right now I'm really liking PyTorch. With TensorFlow I always felt like my models were buried deep in the machine and it was very hard to inspect and change them, and if I wanted to do something non-standard it was difficult even with Keras. With PyTorch though, I connect things however how I want, write whatever training logic I want, and I feel like my model is right in my hands. It's great for research and proofs-of-concept. Maybe for production too."

Another user says, "Might give it another try, but my latest incursion in the Tensorflow universe did not end pleasantly. I ended up recording everything in Pytorch, took me less than a day to do the stuff that took me more than a week in TF. One problem is that there are too many ways to do the same thing in TF and it's hard to transition from one to the other."

The TensorFlow team hopes to resolve the remaining issues before the 2.0 release candidate (RC), including complete Keras model support on Cloud TPUs and TPU pods, and to improve the overall performance of 2.0. The RC release is expected sometime this summer.

Introducing TensorFlow Graphics packed with TensorBoard 3D, object transformations, and much more

Horovod: an open-source distributed training framework by Uber for TensorFlow, Keras, PyTorch, and MXNet

ML.NET 1.0 RC releases with support for TensorFlow models and much more!


Google Research Football Environment: A Reinforcement Learning environment for AI agents to master football

Amrata Joshi
10 Jun 2019
4 min read
Last week, Google researchers announced the release of the Google Research Football Environment, a reinforcement learning environment in which agents can learn to master football. The environment provides a physics-based 3D football simulation where agents control either one or all of the football players on their team, learn how to pass between them, and work out how to overcome their opponent's defense to score goals. The Football Environment offers a game engine, a set of research problems called the Football Benchmarks, the Football Academy, and much more. The researchers have released a beta version of the open-source code on GitHub to facilitate research. Let's have a brief look at each of the elements of the Google Research Football Environment.

Football Engine: The core of the Football Environment

Based on a modified version of Gameplay Football, the Football Engine simulates a football match including fouls, goals, corner and penalty kicks, and offsides. The engine is programmed in C++, which allows it to run both with and without GPU-based rendering enabled. It supports learning either from state representations that contain semantic information, such as the players' locations, or from raw pixels, and it can run in both stochastic and deterministic mode for investigating the impact of randomness. The engine is also compatible with the OpenAI Gym API.

Read Also: Create your first OpenAI Gym environment [Tutorial]

Football Benchmarks: Learning from the actual field game

With the Football Benchmarks, the researchers propose a set of benchmark problems for RL research based on the Football Engine, such as playing a "standard" game of football against a fixed rule-based opponent. They provide three versions, the Football Easy Benchmark, the Football Medium Benchmark, and the Football Hard Benchmark, which differ only in the strength of the opponent. They also provide benchmark results for two state-of-the-art reinforcement learning algorithms, DQN and IMPALA, which can be run in multiple processes on a single machine or concurrently on many machines.

Image Source: Google's blog post

These results indicate that the Football Benchmarks are research problems of varying difficulty. According to the researchers, the Football Easy Benchmark is suitable for research on single-machine algorithms, while the Football Hard Benchmark is challenging even for massively distributed RL algorithms.

Football Academy: Learning from a set of difficult scenarios

The Football Academy is a diverse set of scenarios of varying difficulty that lets researchers explore new research ideas and test high-level concepts, and it provides a foundation for investigating curriculum learning, where agents learn progressively harder scenarios. The official blog post states, "Examples of the Football Academy scenarios include settings where agents have to learn how to score against the empty goal, where they have to learn how to quickly pass between players, and where they have to learn how to execute a counter-attack. Using a simple API, researchers can further define their own scenarios and train agents to solve them."
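Because the engine is compatible with the OpenAI Gym API, a first experiment can be only a few lines long. The snippet below is a sketch based on the gfootball package released with the beta; the create_environment helper, the scenario name, and the representation argument are assumptions taken from the repository and may differ in the final API.

```python
import gfootball.env as football_env

# Create one of the Football Academy scenarios; the agent controls a
# single player and observes a simplified (non-pixel) state vector.
env = football_env.create_environment(
    env_name="academy_empty_goal_close",   # assumed scenario name
    representation="simple115",            # assumed state representation
    render=False)

obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()     # random policy as a baseline
    obs, reward, done, info = env.step(action)
    total_reward += reward

print("Episode reward with a random agent:", total_reward)
```

Swapping the random policy for a DQN or IMPALA agent is the intended next step, since the environment exposes the standard reset/step loop those implementations expect.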
Users are giving mixed reactions to this news, as some find nothing new in the Google Research Football Environment. A user commented on Hacker News, "I guess I don't get it... What does this game have that SC2/Dota doesn't? As far as I can tell, the main goal for reinforcement learning is to make it so that it doesn't take 10k learning sessions to learn what a human can learn in a single session, and to make self-training without guiding scenarios feasible."

Another user commented, "This doesn't seem that impressive: much more complex games run at that frame rate? FIFA games from the 90s don't look much worse and certainly achieved those frame rates on much older hardware."

A few others think they can learn a lot from this environment. Another comment reads, "In other words, you can perform different kinds of experiments and learn different things by studying this environment."

Here's a short YouTube video demonstrating Google Research Football.

https://youtu.be/F8DcgFDT9sc

To know more about this news, check out Google's blog post.

Google researchers propose building service robots with reinforcement learning to help people with mobility impairment

Researchers propose a reinforcement learning method that can hack Google reCAPTCHA v3

Researchers input rabbit-duck illusion to Google Cloud Vision API and conclude it shows orientation-bias

Microsoft quietly deleted 10 million faces from MS Celeb, the world’s largest facial recognition database

Fatema Patrawala
07 Jun 2019
4 min read
Yesterday the Financial Times reported that Microsoft has quietly deleted its facial recognition database: more than 10 million images that were reportedly being used by companies to test their facial recognition software have been removed. The database, known as MS Celeb, was the largest public facial recognition dataset in the world. The data was amassed by scraping images off the web under a Creative Commons license that allows academic reuse of photos. According to Microsoft Research's paper on the database, it was originally designed to train tools for image captioning and news video analysis.

The existence of the database was revealed by Adam Harvey, a Berlin-based artist and researcher whose team investigates the ethics, origins, and individual privacy implications of face recognition image datasets and their role in the expansion of biometric surveillance technologies. The Financial Times ran an in-depth investigation revealing that giant tech companies like IBM and Panasonic, Chinese firms such as SenseTime and Megvii, and military researchers were using the massive database to test their facial recognition software. And now Microsoft has quietly taken MS Celeb down.

"The site was intended for academic purposes," Microsoft told FT.com, explaining that it had deleted the database because "it was run by an employee that is no longer with Microsoft and has since been removed."

Microsoft itself has used the dataset to train facial recognition algorithms, Harvey's investigation found. The company named the dataset "Celeb" to indicate that the faces it had scraped were photos of public figures. But Harvey found that it also included several arguably private individuals, including journalists such as Kim Zetter and Adrian Chen, Shoshana Zuboff, the author of Surveillance Capitalism, and Julie Brill, the former FTC commissioner responsible for protecting consumer privacy. "Microsoft has exploited the term 'celebrity' to include people who merely work online and have a digital identity," said Harvey. "Many people in the target list are even vocal critics of the very technology Microsoft is using their name and biometric information to build."

Some tech experts have speculated that Microsoft deleted the data because continuing to distribute the MS Celeb dataset after the EU's General Data Protection Regulation came into effect last year would have violated the law. But Microsoft said it was not aware of any GDPR implications and that the site had been retired "because the research challenge is over". Engadget also reported that, after the FT's investigation, datasets built by researchers at Duke University and Stanford University were taken down as well.

According to Fast Company, last year Microsoft's president, Brad Smith, spoke about fears of such technology creeping into everyday life and eroding our civil liberties along the way, and the company turned down a facial recognition contract with California law enforcement on human rights grounds. Yet while Microsoft may claim it wants regulation of facial recognition, it may also want to use the technology to sell items listed on the Kroger grocery app, and it has eluded privacy-related scrutiny for years.

Although the database has been deleted, it is still available to the researchers and companies that had previously downloaded it. Once a dataset has been posted online and downloaded, copies continue to exist with whoever holds them.
https://twitter.com/jacksohne/status/1136975380387172355

The dataset is now completely free of any licensing, rules, or controls that Microsoft previously held over it. People are posting it on GitHub and hosting the files on Dropbox and Baidu Cloud, and there is no way to stop them from continuing to post it and use it for their own purposes.

https://twitter.com/sedyst/status/1136735995284660224

Microsoft Build 2019: Microsoft showcases new updates to MS 365 platform with focus on AI and developer productivity

Microsoft open sources SPTAG algorithm to make Bing smarter!

Introducing Minecraft Earth, Minecraft's AR-based game for Android and iOS users


YouTube’s new policy to fight online hate and misinformation misfires due to poor execution, as usual

Fatema Patrawala
07 Jun 2019
9 min read
On Wednesday, YouTube announced that it will remove thousands of videos and channels advocating bigoted ideologies such as white supremacy and neo-Nazism. The move comes under a new policy to tackle hate and extremist views on the platform, as YouTube faces intense scrutiny over its user standards. According to the policy, it will ban "videos alleging that a group is superior in order to justify discrimination, segregation or exclusion," the company said in a blog post. "It's our responsibility to protect that, and prevent our platform from being used to incite hatred, harassment, discrimination and violence," the blog reads.

https://twitter.com/YouTubeInsider/status/1136302198940291072

The policy also covers videos pushing conspiracy theories alleging that violent incidents, like mass shootings, did not take place. For example, InfoWars host Alex Jones continuously pushed a conspiracy theory on YouTube that the mass shooting at Sandy Hook Elementary School in Connecticut was a hoax, and alleged that distraught families of the victims were crisis actors. Last month, Facebook and Instagram banned about six particularly toxic extremist accounts and one conspiracy theorist organization, while Twitter had permanently blocked InfoWars and Jones in September last year for violating its harassment policies.

The new anti-bigotry policy implementation by YouTube

"YouTube has always had rules of the road, including a longstanding policy against hate speech," a company statement said. "Today, we're taking another step in our hate speech policy by specifically prohibiting videos alleging that a group is superior in order to justify discrimination, segregation or exclusion based on qualities like age, gender, race, caste, religion, sexual orientation or veteran status."

"We will begin enforcing this updated policy today; however, it will take time for our systems to fully ramp up and we'll be gradually expanding coverage over the next several months," YouTube said.

In January, YouTube said it would stop recommending videos such as those claiming the earth is flat or promoting bogus theories about the September 11, 2001 terror attacks or the 2012 killings at Sandy Hook Elementary School in Connecticut, but it stopped short of banning such content. YouTube said it would seek ways to preserve some of the violent content so it remains available to researchers.

The latest move is likely to eliminate numerous channels that use the platform for monetization. "We have long standing advertiser-friendly guidelines that prohibit ads from running on videos that include hateful content and we enforce these rigorously," the statement said. "Channels that repeatedly brush up against our hate speech policies will be suspended from the YouTube Partner program, meaning they can't run ads on their channel or use other monetization features."

The new policy could be the result of continuous calls from lawmakers around the world to curb online hate

The move comes after a call by world leaders to curb extremism online, following revelations about the live streaming of the Christchurch attack in New Zealand in March this year. Last month, eight tech companies including Google, along with 18 governments, signed a non-binding agreement to curb online extremism. The Southern Poverty Law Center, which tracks white supremacists and other extremist groups, says the ban will be positive only if YouTube enforces it.
"As with other outlets before it, YouTube's decision to remove hateful content depends on its ability to enact and enforce policies and procedures that will prevent this content from becoming a global organizing tool for the radical right," said the group's intelligence director Heidi Beirich. "Tech companies must proactively tackle the problem of hateful content that is easily found on their platforms before it leads to more hate-inspired violence." Such moves by the social media has prompted criticism among right-wing activists in the United States, and President Donald Trump has claimed that online platforms are seeking to suppress conservative voices. Last month, Trump had also launched a new tool where US citizens can complain about social media bias and share their story. It is also worth noting that the Trump administration did not sign non-binding agreement to curb online extremism. Fallout of an algorithmic driven content moderation policy YouTube with the new policy roll out tried to tackle hate by removing harmful content. But it seems like it ended up also removing content which did not promote hate, but studied them.. It did not distinguish between what is good or bad content and relied upon algorithms to judge and suspend accounts entirely. For instance, as a part of this ban a history professor who had uploaded clips on Nazi policy featuring propaganda speeches by Nazi leaders was banned from YouTube. He shared his disappointment on Twitter and said 15 years of material for History teacher community has ended abruptly. https://twitter.com/MrAllsopHistory/status/1136326031290376193 Policy implementation is insufficient - Vox host Carlos Maza’s case in point YouTube did not disclose the names of any groups or channels that may be banned. But according to Buzzfeed report on 4th June, Vox host Carlos Maza expressed frustration over YouTube for not adequately enforcing its harassment policies. Maza has been experiencing an ongoing racist and anti-gay harassment from a right-wing Youtube personality Steven Crowder on the platform for years. Crowder has approximately 4 million subscribers on YouTube and has uploaded more than 1000 videos till now. Maza wrote a viral Twitter thread last week describing the harassment he is experiencing from Crowder and his followers. Crowder has published a number videos mocking Maza, calling him a “lispy queer” and making other racist and anti-gay comments. Maza, who hosts the Vox show Strikethrough, said he and Vox have directly reached out to YouTube for the past two years “and have gotten no action at all from them.” “Steven Crowder is not the problem. Alex Jones isn’t the problem. These individual actors are not the problem,” Maza said. “They are symptoms and the product of YouTube’s design, which is meant to reward the most inflammatory, bigoted, and engaging performers.” “I think YouTube should enforce its policies, and if it has a policy against hate speech, and bullying and harassment, it should enforce those policies,” he said. “They haven’t done anything so far and I’ll tell you right now there’s a 0.0% chance YouTube punishes Crowder at all. Nothing is going to happen because Crowder is good for engagement.” Andrew Surabian, a Republican strategist and former White House aide, said the move suggests YouTube has caved in to pressure from activists. "If that's their new standard, will they now demonetize all the rap videos with homophobic slurs on their platform?" he said on Twitter. 
https://twitter.com/Surabees/status/1136350669013704705

Nilay Patel, Editor-in-Chief of The Verge, finds it strange that conservatives treat anything said against racism, sexism, or homophobia as an attack on conservative speech.

https://twitter.com/reckless/status/1136396868102103042

https://twitter.com/ZenOfDesign/status/1136665000607735808

Yesterday The Verge reported that Crowder's channel is still operational. YouTube said later on Twitter that it had suspended monetization of Crowder's channel, barring him from getting YouTube ad revenues.

https://twitter.com/kevinroose/status/1136342955030130688

"We came to this decision because a pattern of egregious actions has harmed the broader community and is against our YouTube Partner Program policies," the company said.

Public sentiments on the new policy

YouTube tweeted that "today's announcement generated a lot of questions and confusion. We know it hasn't been easy for everyone. Going forward, we'll be taking a closer look at our own harassment policies, with the aim to update them."

https://twitter.com/TeamYouTube/status/1136475349129105408

There has been a surge of reactions on Twitter about this, and most importantly about how the company changed its stance on the hate speech policy within two days of the announcement. Users are frustrated with this kind of response from YouTube and do not trust it to keep its word. Critics also argue that demonetizing Crowder's channel does not solve the problem of hate-filled rhetoric continuing to spread on a channel of that size. They point out that such channels do not rely primarily on YouTube's payments: the real revenue stream comes from merchandise sold to the channel's audience.

https://twitter.com/FordFischer/status/1136334778670518273

https://twitter.com/jimmy_dore/status/1136418885882634240

https://twitter.com/PardesSeleh/status/1136405937269465089

Update on 24th June: Google warns its employees that Pride protests are against the company's code of conduct

Yesterday, reports from various sources including The Verge said that Google employees are allowed to peacefully protest YouTube or Google during the Pride parade, as long as they are not marching with Google in an official capacity. According to internal memos sent to employees, anyone who chooses to walk the parade as a representative of Google and voice any protest will be considered in violation of Google's code of conduct.

The decision to stifle a would-be protest is frustrating to some employees, who see it as especially ironic given YouTube's dedication to free speech. "YouTubers who use our platform and sometimes get significant revenue get to claim free speech to keep using our platform ... but LGBT Googlers get no free speech to say that Google doesn't represent us," one tells The Verge. "That's ironic at best, but hypocritical ... specifically ironic trying to curb our speech on the 50th anniversary of the Stonewall march riots."

https://twitter.com/Tri_Becca90/status/1143265475561791488

https://twitter.com/EthicalGooglers/status/1143276811649978369

https://twitter.com/lizthegrey/status/1143293045711892483

Mozilla and Google Chrome refuse to support Gab's Dissenter extension for violating acceptable use policy

Google Cloud went offline taking with it YouTube, Snapchat, Gmail, and a number of other web services

YouTube went down, Twitter flooded with deep questions, YouTube back and everyone is back to watching cat videos