
Tech News - Data

1208 Articles

Mozilla developers have built BugBug which uses machine learning to triage Firefox bugs

Amrata Joshi
10 Apr 2019
3 min read
Yesterday, the team at Mozilla noted that the company receives hundreds of bug reports and feature requests from Firefox users every day. Fixing bugs quickly matters for the smooth functioning of its systems, and developers need to learn about a bug promptly in order to fix it. Bug triage, a process in which tracker issues are screened and prioritised, is useful here. However, even when developers know bugs exist, it is difficult to look closely at each one. Mozilla has used Bugzilla, a web-based general-purpose bug tracker and testing tool that groups bugs by product, for years. But product assignment was done manually by developers, so the process failed to scale. Now Mozilla is experimenting with machine learning to train systems to triage bugs.

BugBug

It’s important to get bugs in front of the right set of engineers, so the team at Mozilla developed BugBug, a machine learning tool that automatically assigns a product and component to every new untriaged bug. By bringing bugs onto the radar of triage owners sooner, Mozilla aims to decrease the turnaround time for fixing new issues.

Training the BugBug model

Mozilla has a large training set for this model: two decades’ worth of bugs that have been reviewed by Mozillians and assigned to products and components. The bug data can’t be used as-is, since any change made to a bug after triage would mislead the model, so the team rolled each bug back to the state in which it was originally filed. Out of 396 components, 225 had more than 49 bugs filed in the past two years. In operation, the team performs an assignment only when the model is confident enough in its decision; currently the team uses a 60% confidence threshold.
Since deploying BugBug in production at the end of February 2019, the team has triaged around 350 bugs. The median time for a developer to act on a triaged bug is 2 days; the average time to act is 9 days, or around 4 days with outliers removed.

Mozilla plans to use machine learning in the future

The Mozilla team plans to use machine learning to assist in other software development processes, such as identifying duplicate bugs, providing automated help to developers, and detecting the bugs that are important for a Firefox release. The team also plans to extend BugBug to automatically assign components for other Mozilla products. To know more about this news, check out the post by Mozilla.

Mozilla is exploring ways to reduce notification permission prompt spam in Firefox
Mozilla launches Firefox Lockbox, a password manager for Android
Mozilla’s Firefox Send is now publicly available as an encrypted file sharing service
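To picture the kind of pipeline BugBug implements, here is a minimal, hypothetical sketch of confidence-thresholded triage: a toy naive-Bayes classifier over bug titles that only auto-assigns a component when its confidence clears a 60% threshold like the one mentioned above. The bug titles, component names, and classifier choice are all invented for illustration; Mozilla’s actual model is trained on far richer Bugzilla data.

```python
# Hypothetical sketch of confidence-thresholded bug triage. The classifier,
# bug titles, and component names below are invented for illustration;
# Mozilla's actual BugBug model is trained on richer Bugzilla data.
from collections import Counter, defaultdict
import math

def train(examples):
    """examples: list of (title, component) pairs -> naive-Bayes counts."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for title, component in examples:
        class_counts[component] += 1
        for word in title.lower().split():
            word_counts[component][word] += 1
            vocab.add(word)
    return word_counts, class_counts, vocab

def predict(model, title, threshold=0.60):
    """Return (component, confidence), or (None, confidence) when unsure."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_probs = {}
    for component, count in class_counts.items():
        lp = math.log(count / total)
        denom = sum(word_counts[component].values()) + len(vocab)
        for word in title.lower().split():
            # Laplace smoothing so unseen words don't zero out a class.
            lp += math.log((word_counts[component][word] + 1) / denom)
        log_probs[component] = lp
    top = max(log_probs.values())
    scores = {c: math.exp(lp - top) for c, lp in log_probs.items()}
    best = max(scores, key=scores.get)
    confidence = scores[best] / sum(scores.values())
    # Only auto-assign when the model is confident enough in its decision.
    return (best, confidence) if confidence >= threshold else (None, confidence)

bugs = [
    ("crash when opening pdf viewer", "PDF Viewer"),
    ("pdf rendering is blank", "PDF Viewer"),
    ("bookmarks toolbar missing", "Bookmarks"),
    ("cannot delete bookmark folder", "Bookmarks"),
]
model = train(bugs)
print(predict(model, "pdf viewer crash on load"))
```

On a title full of unseen words the confidence stays below the threshold and the sketch abstains, mirroring how BugBug leaves low-confidence bugs for human triage.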


China will ban cryptomining to prevent wastage of resources

Savia Lobo
10 Apr 2019
2 min read
Yesterday, China’s National Development and Reform Commission (NDRC) published a proposal to ban cryptocurrency mining, on the grounds that crypto mining is a waste of valuable resources. The commission is awaiting public feedback on the proposal, and it also indicated that “the crypto-mining ban could take effect as soon as they’re formally issued”, Bloomberg reports. The proposal also includes a “revised list--first published in 2011--of industries it wants to encourage, restrict or eliminate”, Reuters reports. The public can comment on the draft until May 7.

Cryptocurrencies such as Bitcoin are mined with specialized computers that consume a huge amount of energy. According to the South China Morning Post, China’s coal-rich regions, Xinjiang and Inner Mongolia, have become popular destinations for crypto-miners looking for cheap electricity. “It’s estimated that as much as 74 percent of global crypto mining is occurring in China, a place where it’s also the most carbon-intensive”, Gizmodo reports. A recent report in Nature Sustainability says crypto mining emits anywhere between 3 million and 15 million tons of carbon dioxide globally. In 2017, China banned initial coin offerings and put a halt to virtual currency trading. In 2018, Chinese officials outlined proposals to discourage crypto mining.
Jehan Chu, a managing partner at blockchain investment firm Kenetic, said, “The NDRC’s move is in line overall with China’s desire to control different layers of the rapidly growing crypto industry, and does not yet signal a major shift in policy.” According to Reuters, “The draft for a revised list added cryptocurrency mining, including that of bitcoin, to more than 450 activities the NDRC said should be phased out as they did not adhere to relevant laws and regulations, were unsafe, wasted resources or polluted the environment.” An executive who works closely with Chinese mining firms told Wired that although the ban was widely expected to move forward, miners expect it will take years for the government to fully rein in their operations. To know more about this news in detail, head over to Bloomberg.

Crypto-cash is missing from the wallet of dead cryptocurrency entrepreneur Gerald Cotten – find it, and you could get $100,000
FOSDEM 2019: Designing better cryptographic mechanisms to avoid pitfalls – Talk by Maximilian Blochberger
Google expands its Blockchain search tools, adds six new cryptocurrencies in BigQuery Public Datasets


Horovod: an open-source distributed training framework by Uber for TensorFlow, Keras, PyTorch, and MXNet

Natasha Mathur
09 Apr 2019
3 min read
The LF Deep Learning Foundation, a community umbrella project of The Linux Foundation, announced last December that Horovod, started by Uber in 2017, is its newest project. Uber joined the Linux Foundation in November 2018 to support the LF Deep Learning Foundation’s open source projects. Horovod (named after a traditional Russian dance), announced at KubeCon + CloudNativeCon North America 2018, is an open source distributed training framework for TensorFlow, Keras, MXNet, and PyTorch. It improves speed, scaling, and resource allocation in machine learning training activities. The main goal of Horovod is to make distributed deep learning simple and fast.

Ever since its release, Horovod has been leveraged across different tasks and by different companies. For instance, Uber has been using Horovod for self-driving vehicles, fraud detection, and trip forecasting. Other companies using Horovod include Alibaba, Amazon, and NVIDIA, while contributors to the Horovod project include Amazon, IBM, Intel, and NVIDIA. IBM uses Horovod as part of its open source deep learning solution, FfDL, and in its IBM Watson Studio. Databricks also features Horovod in its deep learning offering. Similarly, NVIDIA announced last November that it is using Uber’s Horovod to build an AI computing platform for developers of self-driving vehicles. Molly Vorwerck, Editorial Program Manager for Uber Engineering, mentioned that “Horovod was a clear choice for NVIDIA. With only a few lines of code, Horovod allowed them to scale from one to eight GPUs, optimizing model training for their self-driving sensing and perception technologies, leading to faster, safer systems”.

Horovod makes it easy to take a single-GPU TensorFlow program and train it on many GPUs, and it makes it easier to achieve improved GPU resource usage.
It makes use of advanced algorithms and high-performance networks to give data scientists and other researchers the tooling to easily scale their deep learning models with high performance. The open source community’s response to Horovod has also been very positive. “It was very cool to see my first open source project reach so many people and be adopted so quickly... now, when I go to conferences people actually know of Horovod and they’re excited to integrate with it... all these things make me really happy”, states Alex Sergeev, Horovod Project Lead.

Apart from that, Horovod joins the existing Linux Foundation deep learning projects, namely Acumos AI (an open source AI framework), Angel (a high-performance distributed machine learning platform), and EDL (an Elastic Deep Learning framework). These projects are designed to help cloud service providers build cluster cloud services using deep learning frameworks.

“Uber built Horovod to make deep learning model training faster and more intuitive for AI researchers across industries. As Horovod continues to mature in its functionalities and applications, this collaboration will enable us to further scale its impact in the open source ecosystem for the advancement of AI,” said Sergeev. For more information, check out the official Horovod blog post.

Uber open-sources Peloton, a unified Resource Scheduler
Uber releases Ludwig, an open source AI toolkit that simplifies training deep learning models
Uber releases AresDB, a new GPU-powered real-time Analytics Engine
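Those “few lines of code” work because Horovod’s central primitive is an allreduce that averages gradients across all workers after each step, keeping every model replica identical. Here is a toy pure-Python simulation of that idea, not the real Horovod API (which implements the step efficiently with ring-allreduce over MPI or NCCL and exposes it through a distributed optimizer wrapper):

```python
# Toy simulation of the gradient-averaging allreduce at the heart of
# data-parallel training. Horovod performs this efficiently with
# ring-allreduce over MPI/NCCL; plain Python stands in for it here.

def allreduce_average(worker_grads):
    """Average each gradient component across all workers."""
    n = len(worker_grads)
    return [sum(components) / n for components in zip(*worker_grads)]

def sgd_step(weights, worker_grads, lr=0.1):
    """One synchronized SGD step: every worker applies the same averaged update."""
    avg = allreduce_average(worker_grads)
    return [w - lr * g for w, g in zip(weights, avg)]

# Two workers compute gradients on their own data shards...
weights = [1.0, 2.0]
grads_w0 = [0.2, -0.4]
grads_w1 = [0.6, 0.0]

# ...and all replicas stay in sync because they apply identical updates.
weights = sgd_step(weights, [grads_w0, grads_w1])
print(weights)
```

Because every worker sees the same averaged gradient, adding workers scales the effective batch size without the replicas ever drifting apart, which is what lets a single-GPU script become a multi-GPU one with minimal changes.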


Facebook AI introduces Aroma, a new code recommendation tool for developers

Natasha Mathur
09 Apr 2019
3 min read
The Facebook AI team announced a new tool called Aroma last week. Aroma is a code-to-code search and recommendation tool that makes use of machine learning (ML) to simplify the process of gaining insights from big codebases. Aroma allows engineers to find common coding patterns easily by making a search query, without any need to manually browse through code snippets, which in turn saves time in their development workflow. So, in case developers have written some code but want to see how others have implemented the same thing, they can run a search query to find similar code in related projects. After the search query is run, results are returned as code “recommendations”. Each code recommendation is built from a cluster of similar code snippets found in the repository.

Aroma is more advanced than traditional code search tools. For instance, Aroma performs the search on syntax trees: instead of looking for string-level or token-level matches, it can find instances that are syntactically similar to the query code, and then highlight the matching code by cutting away the unrelated syntax structures. Aroma is very fast and creates recommendations within seconds for large codebases. Moreover, Aroma’s core algorithm is language-agnostic and can be deployed across codebases in Hack, JavaScript, Python, and Java.

How does Aroma work?

Aroma follows a three-step process to make code recommendations: feature-based search, re-ranking and clustering, and intersecting. For feature-based search, Aroma indexes the code corpus as a sparse matrix. It parses each method in the corpus, creates its parse tree, and extracts a set of structural features from the parse tree of each method. These features capture information about variable usage, method calls, and control structures.
Finally, a sparse vector is created for each method according to its features, and the top 1,000 method bodies whose dot products with the query’s vector are highest are retrieved as the candidate set for recommendation.

For re-ranking and clustering, Aroma first re-ranks the candidate methods by their similarity to the query code snippet. Since the sparse vectors contain only abstract information about which features are present, the dot product score is an underestimate of the actual similarity of a code snippet to the query. To correct for this, Aroma applies “pruning” to the method syntax trees, which discards the irrelevant parts of a method body while retaining the parts that best match the query snippet. This is how it re-ranks the candidate code snippets by their actual similarity to the query. Aroma then runs an iterative clustering algorithm to find clusters of code snippets that are similar to each other and contain extra statements useful for making code recommendations.

For intersecting, a code snippet is first taken as the “base” code, and “pruning” is then applied to it iteratively with respect to every other method in the cluster. The code remaining after the pruning process is the code common to all methods in the cluster, and this becomes a code recommendation.

“We believe that programming should become a semiautomated task in which humans express higher-level ideas and detailed implementation is done by the computers themselves”, states the Facebook AI team. For more information, check out the official Facebook AI blog.

How to make machine learning based recommendations using Julia [Tutorial]
Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs
Facebook AI research and NYU school of medicine announces new open-source AI models and MRI dataset
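The feature-based search step can be pictured with a small, hypothetical sketch: each method becomes a sparse feature vector, and candidates are ranked by dot product against the query’s vector. Real Aroma extracts structural features from parse trees and retrieves the top 1,000 candidates; simple token counts and a tiny three-method corpus stand in for that here.

```python
# Hypothetical sketch of Aroma-style feature-based retrieval: each method
# becomes a sparse feature vector and candidates are ranked by dot product
# with the query's vector. Real Aroma derives structural features from
# parse trees; token counts stand in for them here.
from collections import Counter

def features(code):
    """Stand-in feature extractor: token counts as a sparse vector."""
    return Counter(code.replace("(", " ( ").replace(")", " ) ").split())

def dot(a, b):
    """Dot product of two sparse vectors stored as Counters."""
    return sum(a[token] * b[token] for token in a if token in b)

def top_candidates(query, corpus, k=2):
    """Rank corpus methods by (sparse) similarity to the query snippet."""
    query_vec = features(query)
    ranked = sorted(corpus, key=lambda m: dot(query_vec, features(m)), reverse=True)
    return ranked[:k]

corpus = [
    "for x in items: total += x",
    "with open(path) as f: data = f.read()",
    "for x in values: count += 1",
]
print(top_candidates("for x in items: s += x", corpus, k=1))
```

As the article notes, this dot-product score only underestimates true similarity; the re-ranking and pruning stages then refine the shortlist on the actual syntax trees.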


IEEE Spectrum: IBM Watson has a long way to go before it becomes an efficient AI doctor

Natasha Mathur
09 Apr 2019
5 min read
Eliza Strickland, Senior Associate Editor at IEEE Spectrum, a magazine by the Institute of Electrical and Electronics Engineers, published an article last week about how IBM Watson still has a long way to go before it establishes itself as an efficient AI in the healthcare industry.

IBM Watson, a question-answering computer system capable of answering questions posed in natural language, made headlines back in February 2011 when it defeated two human champions in the game of Jeopardy!, a popular American quiz show. This was also the time when IBM researchers explored the possibility of extending Watson’s capabilities to ‘revolutionize’ health care. IBM decided to apply Watson’s outstanding NLP capabilities to medicine, and even promised a commercial product.

The first time IBM showed off Watson’s potential to transform medicine using AI was back in 2014. For the demo, Watson was fed a bizarre collection of patient symptoms, from which it produced a list of possible diagnoses. Watson’s memory bank included information on even the rarest of diseases, and its processors were unbiased in approach, giving it an edge over other AIs for doctors. “If Watson could bring that instant expertise to hospitals and clinics all around the world, it seemed possible that the AI could reduce diagnosis errors, optimize treatments, and even alleviate doctor shortages—not by replacing doctors but by helping them do their jobs faster and better,” writes Strickland. However, IBM could not follow up on that promise. “In the eight years since, IBM has trumpeted many more high-profile efforts to develop AI-powered medical technology—many of which have fizzled and a few of which have failed spectacularly,” writes Strickland.

Moreover, the products that have emerged from the IBM Watson Health division are more like basic AI assistants capable of performing routine tasks, nowhere close to an AI doctor.

Challenges faced by Watson in the healthcare industry

While IBM was considering Watson’s possibilities in the healthcare industry, the most challenging issue was that the bulk of patient data in medicine is unstructured. This includes doctors’ notes and hospital discharge summaries, which account for about 80 percent of a typical patient’s record and are an amalgamation of jargon, shorthand, and subjective statements.

Another challenge faced by IBM Watson is the diagnosis of cancer. Mark Kris, a lung cancer specialist at Memorial Sloan Kettering Cancer Center in New York City, along with other preeminent physicians, trained an AI system known as Watson for Oncology in 2015. Watson for Oncology would learn by ingesting the vast medical literature on cancer and the health records of real cancer patients, and uncover patterns unknown to humans. Other preeminent physicians, at the University of Texas MD Anderson Cancer Center in Houston, collaborated with IBM to create a tool called Oncology Expert Advisor. Both products, however, faced severe criticism, with detractors saying that Watson for Oncology at times provided ‘useless’ and ‘dangerous’ recommendations. “A deeper look at these two projects reveals a fundamental mismatch between the promise of machine learning and the reality of medical care—between ‘real AI’ and the requirements of a functional product for today’s doctors”, writes Strickland.

Although Watson learned quickly to scan articles on clinical studies, it was difficult to teach Watson to read the articles the way a doctor would. “The information that physicians extract from an article, that they use to change their care, may not be the major point of the study. Watson’s thinking is based on statistics, so all it can do is gather statistics about main outcomes”, adds Mark Kris.

Researchers further found that Watson was incapable of mining information from patients’ electronic health records, and that it could not compare a new patient with large numbers of other cancer patients to discover hidden patterns. They had hoped that Watson would mimic the abilities of expert oncologists, and they were disappointed.

Despite these challenges, IBM Watson has also had its share of success stories. Strickland cites the example of Watson for Genomics, developed in partnership with the University of North Carolina, Yale University, and other renowned institutions. The tool helps genetics labs generate reports for practicing oncologists: Watson ingests lists of a patient’s genetic mutations and generates a report describing all the relevant drugs and clinical trials in just a few seconds. Moreover, IBM’s partners at the University of North Carolina published a paper on the effectiveness of Watson for Genomics in 2017.

Effective or not, IBM Watson still has a long queue of hurdles to cross before IBM reaches its dream of making Watson the impeccable ‘AI doctor’. For more information, check out the official IEEE Spectrum article.

IBM CEO, Ginni Rometty, on bringing HR evolution with AI and its predictive attrition AI
IBM sued by former employees on violating age discrimination laws in workplace
IBM announces the launch of Blockchain World Wire, a global blockchain network for cross-border payments


Tech companies in EU to face strict regulation on terrorist content: one-hour takedown limit, upload filters and private Terms of Service

Fatema Patrawala
08 Apr 2019
11 min read
Countries around the world are seeking to exert more control over content on the internet – and, by extension, over their citizens. With more acts of terrorism taking place, attacks now have an online dimension too, with material like the footage of the recent Christchurch shooting proliferating as supporters upload it to any media platform they can reach. Lawmakers around the world have had enough, and this year they hope to enact new legislation that will hold big tech companies like Facebook and Google more accountable for any terrorist-related content they host.

The Australian parliament passed legislation to crack down on violent videos on social media. Recently Sen. Elizabeth Warren, a US 2020 presidential hopeful, proposed building stronger anti-trust laws and breaking up big tech companies like Amazon, Google, Facebook and Apple. On 3rd April, Warren introduced the Corporate Executive Accountability Act, a new piece of legislation that would make it easier to criminally charge company executives when Americans’ personal data is breached. The Washington Post reports that the UK has drafted an aggressive new plan to penalise Facebook, Google and other tech giants that don’t stop the spread of harmful content online. Last year, the German parliament enacted the NetzDG law, requiring large social media sites to remove posts that violate certain provisions of the German code, including broad prohibitions on “defamation of religion,” “hate speech,” and “insult.” The removal obligation is triggered not by a court order, but by complaints from users, and companies must remove the posts within 24 hours or seven days, facing steep fines if they fail to do so.

Joining the bandwagon, Europe has also drafted an EU Regulation on preventing the dissemination of terrorist content online. The legislation was first proposed by the EU last September as a response to the spread of ISIS propaganda online, which encouraged further attacks.
It covers recruiting materials such as displays of a terrorist organization’s strength, instructions for how to carry out acts of violence, and anything that glorifies the violence itself. Social media is an important part of terrorists’ recruitment strategy, say backers of the legislation. “Whether it was the Nice attacks, whether it was the Bataclan attack in Paris, whether it’s Manchester, [...] they have all had a direct link to online extremist content,” says Lucinda Creighton, a senior adviser at the Counter Extremism Project (CEP), a campaign group that has helped shape the legislation.

The new laws require platforms to take down any terrorism-related content within an hour of a notice being issued, force them to use a filter to ensure it’s not reuploaded, and, if they fail in either of these duties, allow governments to fine companies up to 4 percent of their global annual revenue. For a company like Facebook, which earned close to $17 billion in revenue last year, that could mean fines of as much as $680 million (around €600 million). Advocates of the legislation say it’s a set of common-sense proposals designed to prevent online extremist content from turning into real-world attacks. But critics, including internet-freedom think tanks and big tech firms, claim the legislation threatens the principles of a free and open internet, and may jeopardize the work being done by anti-terrorist groups. The proposals are currently working their way through the committees of the European Parliament, so a lot could change before the legislation becomes law. Both sides want to find a balance between allowing freedom of expression and stopping the spread of extremist content online, but they have very different ideas about where this balance lies.

Why is the legislation needed?

Terrorists use social media to promote themselves, just like big brands do. Organizations such as ISIS use online platforms to radicalize people across the globe.
Those people may then travel to join the organization’s ranks in person or commit terrorist attacks in support of ISIS in their home countries. At its peak, ISIS had a devastatingly effective social media strategy, which both instilled fear in its enemies and recruited new supporters. In 2019, the organization’s physical presence in the Middle East has been all but eliminated, but the legislation’s supporters argue that this means there’s an even greater need for tougher online rules: as the group’s physical power has diminished, the online war of ideas has become more important than ever.

In the recent attack in New Zealand, the alleged shooter, identified as a 28-year-old Australian man, Brenton Tarrant, announced the attack on the anonymous-troll message board 8chan. He posted images of the weapons days before the attack, and made an announcement an hour before the shooting. On 8chan, Facebook and Twitter, he also posted links to a 74-page manifesto, titled “The Great Replacement,” blaming immigration for the displacement of whites in Oceania and elsewhere. The manifesto cites “white genocide” as a motive for the attack, and calls for “a future for white children” as its goal. He then live-streamed the attacks on Facebook and YouTube, and posted a link to the stream on 8chan.

“Every attack over the last 18 months or two years or so has got an online dimension. Either inciting or in some cases instructing, providing instruction, or glorifying,” Julian King, a British diplomat and European commissioner for the Security Union, told The Guardian when the laws were first proposed. The increasing frequency with which terrorists become “self-radicalized” by online material shows the importance of the proposed laws.

One-hour takedown limit; upload filters & private Terms of Service

The one-hour takedown is one of two core obligations for tech firms proposed by the legislation.
Under the proposals, each EU member state will designate a so-called “competent authority.” It’s up to each member state to decide exactly how this body operates, but the legislation says these authorities are responsible for flagging problematic content, including videos and images that incite terrorism, that provide instructions for how to carry out an attack, or that otherwise promote involvement with a terrorist group. Once content has been identified, the authority sends a removal order to the platform that’s hosting it, which can then delete it or disable access to it for any users inside the EU. Either way, action needs to be taken within one hour of a notice being issued. It’s a tight time limit, but removing content this quickly is important to stop its spread, according to Creighton. This obligation is similar to voluntary rules already in place that encourage tech firms to take down content flagged by law enforcement and other trusted agencies within an hour.

Another part is the addition of a legally mandated upload filter, which would hypothetically stop the same pieces of extremist content from being continuously reuploaded after being flagged and removed — although these filters have sometimes been easy to bypass in the past. “The frustrating thing is that [extremist content] has been flagged with the tech companies, it’s been taken down and it’s reappearing a day or two or a week later,” Creighton says. “That has to stop and that’s what this legislation targets.”

The other part allows the prohibition of content under platforms’ private Terms of Service (TOS), rather than national law, letting them take down more material than the law actually requires. This effectively increases the power of authorities in any EU member state to suppress information that is legal elsewhere in the EU.
For example, authorities in Hungary and authorities in Sweden may disagree about whether a news organization sharing an interview with a current or former member of a terrorist organization is “promoting” or “glorifying” terrorism. Or they may differ on the legitimacy of a civil society organization’s advocacy on complex issues in Chechnya, Israel, or Kurdistan. This regulation gives platforms reason to use their TOS to accommodate whichever authority wants such content taken down – and to apply that decision to users everywhere.

What’s the problem with the legislation?

Critics say that the upload filter could be used by governments to censor their citizens, and that aggressively removing extremist content could prevent non-governmental organizations from documenting events in war-torn parts of the world. One prominent opponent is the Center for Democracy and Technology (CDT), a think tank funded in part by Amazon, Apple, Facebook, Google, and Microsoft. Earlier this year, it published an open letter to the European Parliament, saying the legislation would “drive internet platforms to adopt untested and poorly understood technologies to restrict online expression.” The letter was co-signed by 41 campaigners and organizations, including the Electronic Frontier Foundation, Digital Rights Watch, and Open Rights Group. “These filtering technologies are certainly being used by the big platforms, but we don’t think it’s right for government to force companies to install technology in this way,” the CDT’s director for European affairs, Jens-Henrik Jeppesen, told The Verge in an interview.

Removing certain content, even if a human moderator has correctly identified it as extremist in nature, could prove disastrous for the human rights groups that rely on it to document attacks. For instance, in the case of Syria’s civil war, footage of the conflict is one of the only ways to prove when human rights violations occur.
But between 2012 and 2018, Google took down over 100,000 videos of attacks carried out in Syria’s civil war, destroying vital evidence of what took place. The Syrian Archive, an organization that aims to verify and preserve footage of the conflict, has been forced to back up footage on its own site to prevent the records from disappearing. Opponents of the legislation like the CDT also say that the filters could end up acting like YouTube’s frequently criticized Content ID system, which allows copyright owners to file takedowns on videos that use their material, but which sometimes removes videos posted by their original owners, misidentifies original clips as copyrighted, and can be easily circumvented.

Opponents of the legislation also believe that the current voluntary measures are enough to stop the flow of terrorist content online. They claim the majority of terrorist content has already been removed from the major social networks, and that a user would have to go out of their way to find such content on a smaller site. “It is disproportionate to have new legislation to see if you can sanitize the remaining 5 percent of available platforms,” Jeppesen says. Human rights organizations need to be able to view this content, no matter how troubling it might be, in order to investigate war crimes. Their independence from governments is what makes their work valuable, but it could also mean they’re shut out under the new legislation. Creighton, however, doesn’t believe free and public access to this information is the answer. She argues that needing to “analyze and document recruitment to ISIS in East London” isn’t a good enough excuse to leave content on the internet if the existence of that content “leads to a terrorist attack in London, or Paris or Dublin.”

The legislation is currently working its way through the European Parliament, and its exact wording could yet change.
At the time of publication, the legislation’s lead committee is due to vote on its report on the draft regulation. After that, it must proceed through the trilogue stage — where the European Commission, the Council of the European Union, and the European Parliament debate the contents of the legislation — before it can finally be voted into law by the European Parliament. Neither its opponents nor its supporters believe a final vote will take place any sooner than the end of 2019, because the European Parliament’s current term ends next month, and elections must take place before the next term begins in July. Here’s the link to the proposed bill by the European Commission.

How social media enabled and amplified the Christchurch terrorist attack
Tech regulation to an extent of sentence jail: Australia’s ‘Sharing of Abhorrent Violent Material Bill’ to Warren’s ‘Corporate Executive Accountability Act’
EU’s antitrust Commission sends a “Statements of Objections” to Valve and five other video game publishers for “geo-blocking” purchases
Fabrice Bellard, the creator of FFmpeg and QEMU, introduces a lossless data compressor which uses neural networks

Bhagyashree R
08 Apr 2019
3 min read
Last month, Fabrice Bellard and his team published a paper named Lossless Data Compression with Neural Networks. The paper describes lossless data compressors that use pure NN models based on Long Short-Term Memory (LSTM) and Transformer models.

How does this model work?

This lossless data compressor uses the traditional predictive approach: at each step, the encoder computes the probability vector of the next symbol value with the help of the neural network model, knowing all the preceding symbols. The actual symbol value is then encoded using an arithmetic encoder, and the model is updated with the symbol value. The decoder works symmetrically, meaning that both the encoder and decoder update their models identically, so there is no need to transmit the model parameters. To improve the compression ratio and speed, a preprocessing stage was added. For small LSTM models, the team reused the text preprocessor of CMIX and lstm-compress to have a meaningful comparison. The larger models used a subword-based preprocessor where each symbol represents a sequence of bytes. The model uses arithmetic coding, a standard compression technique, and tries to make it adaptive. Here's an example by a Reddit user that explains how exactly this model works:

"The rough frequency of `e' in English is about 50%. But if you just saw this partial sentence "I am going to th", the probability/frequency of `e' skyrockets to, say, 98%. In standard arithmetic coding scheme, you would still parametrize you encoder with 50% to encode the next "e" despite it's very likely (~98%) that "e" is the next character (you are using more bits than you need in this case), while with the help of a neural network, the frequency becomes adaptive."

To ensure that both the decoder and encoder are using the exact same model, the authors have developed a custom C library called LibNC.
This library is responsible for implementing the various operations needed by the models. It has no dependency on any other libraries and has a C API.

Results of the experiment

Performance of the model was evaluated against the enwik8 Hutter Prize benchmark. The models show slower decompression speed: 1.5x slower for the LSTM model and 3x slower for the Transformer model. But their description is simple, and the memory consumption is reasonable compared to other compressors giving a similar compression ratio. Speaking of the compression ratio, the models are yet to reach the performance of CMIX, a lossless data compressor that gives an optimized compression ratio at the cost of high CPU/memory usage. In all the experiments, the Transformer model gives worse performance than the LSTM model, although it gives the best performance in language modeling benchmarks. To know more in detail, check out the paper, Lossless Data Compression with Neural Networks.

Microsoft open-sources Project Zipline, its data compression algorithm and hardware for the cloud
Making the Most of Your Hadoop Data Lake, Part 1: Data Compression
Interpretation of Functional APIs in Deep Neural Networks by Rowel Atienza
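The predictive loop described in the article (predict, arithmetically encode, then update, with the decoder mirroring every update) can be sketched with a toy counts-based model standing in for the paper's LSTM and Transformer predictors. This is only an illustrative sketch: the class and function names are invented, and exact `Fraction` arithmetic sidesteps the finite-precision renormalization a production coder would use.

```python
from fractions import Fraction

class AdaptiveModel:
    """Counts-based stand-in for the paper's neural predictor."""
    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = {s: 1 for s in self.alphabet}  # Laplace smoothing

    def intervals(self):
        # Map each symbol to its cumulative-probability interval [lo, hi).
        total = sum(self.counts.values())
        lo, out = Fraction(0), {}
        for s in self.alphabet:
            hi = lo + Fraction(self.counts[s], total)
            out[s] = (lo, hi)
            lo = hi
        return out

    def update(self, symbol):
        self.counts[symbol] += 1

def encode(message, alphabet):
    model = AdaptiveModel(alphabet)
    low, high = Fraction(0), Fraction(1)
    for s in message:
        lo, hi = model.intervals()[s]
        # Narrow the interval to the slice the model assigned to s.
        low, high = low + (high - low) * lo, low + (high - low) * hi
        model.update(s)  # the decoder will make the identical update
    return (low + high) / 2  # any number inside the final interval

def decode(code, length, alphabet):
    model = AdaptiveModel(alphabet)
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (code - low) / (high - low)
        for s, (lo, hi) in model.intervals().items():
            if lo <= target < hi:
                out.append(s)
                low, high = low + (high - low) * lo, low + (high - low) * hi
                model.update(s)
                break
    return "".join(out)
```

Because the decoder replays exactly the same model updates, the probabilities adapt identically on both sides without any parameters being transmitted, which is the property the paper relies on.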
Researchers introduce a new generative model, HoloGAN, that learns 3D representation from natural images

Natasha Mathur
08 Apr 2019
4 min read
A group of researchers, namely, Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, and Yong-Liang Yang, published a paper titled 'HoloGAN: Unsupervised learning of 3D representations from natural images' last week. In the paper, the researchers propose a generative adversarial network (GAN), called HoloGAN, for the task of unsupervised learning of 3D representations from natural images. HoloGAN works by adopting strong 'inductive biases' about the 3D world. The paper states that commonly used generative models depend on 2D kernels to produce images and make assumptions about the 3D world. This is why these models tend to create blurry images in tasks that require a strong 3D understanding. HoloGAN, however, learns a 3D representation of the world and is successful at rendering this representation in a realistic manner. It can be trained using unlabelled 2D images, without requiring pose labels, 3D shapes, or multiple views of the same objects. "Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models," states the paper.

How does HoloGAN work?

HoloGAN first considers a 3D representation, which is later transformed to a target pose, projected to 2D features, and rendered to generate the final images. HoloGAN then learns perspective projection and rendering of 3D features from scratch with the help of a projection unit. Finally, to generate new views of the same scene, 3D rigid-body transformations are applied to the learned 3D features, and the results are visualized using a neural renderer. This produces sharper results than performing 3D transformations in high-dimensional latent vector space.
To learn 3D representations from 2D images without labels, HoloGAN extends the capability of traditional unconditional GANs by introducing a strong inductive bias about the 3D world into the generator network. During training, random poses are sampled from a uniform distribution, and the 3D features are transformed using these poses before they are rendered into images. A variety of datasets are used to train HoloGAN, namely, Basel Face, CelebA, Cats, Chairs, Cars, and LSUN bedroom. HoloGAN is trained at a resolution of 64×64 pixels for Cats and Chairs, and 128×128 pixels for Basel Face, CelebA, Cars, and LSUN bedroom. Other than that, HoloGAN generates 3D representations from a learned constant tensor. The random noise vector is instead treated as a "style" controller and gets mapped to affine parameters for adaptive instance normalization (AdaIN) using a multilayer perceptron.

Results and Conclusion

In the paper, the researchers show that HoloGAN can generate images with comparable or greater visual fidelity than other 2D-based GAN models. HoloGAN can also learn to disentangle challenging factors in an image, such as 3D pose, shape, and appearance. The paper also shows that HoloGAN can successfully learn meaningful 3D representations across different datasets with varying levels of complexity. "We are convinced that explicit deep 3D representations are a crucial step forward for both the interpretability and controllability of GAN models, compared to existing explicit or implicit 3D representations," reads the paper. However, the researchers state that while HoloGAN is successful at separating pose from identity, its performance largely depends on the variety and distribution of poses included in the training dataset. The paper cites the example of the CelebA and Cats datasets, on which the model cannot recover elevation as well as it recovers azimuth.
This is due to the fact that most face images are taken at eye level, thereby containing limited variation in elevation.

Future work

The paper states that the researchers would like to explore learning the distribution of poses from the training data in an unsupervised manner for uneven pose distributions. Other directions that can be explored include further disentanglement of objects' shape and appearance (texture and illumination). The researchers are also looking into combining HoloGAN with training techniques such as progressive GANs to produce higher-resolution images. For more information, check out the official research paper.

DeepMind researchers provide theoretical analysis on recommender system, 'echo chamber' and 'filter bubble effect'
Google AI researchers introduce PlaNet, an AI agent that can learn about the world using only images
Facebook researchers show random methods without any training can outperform modern sentence embeddings models for model classification
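The pose-sampling and rigid-body transformation step described above can be illustrated with a small sketch (not the authors' code): draw a random azimuth from a uniform distribution and rotate a block of 3D feature coordinates about the vertical axis before they would be projected to 2D. A real implementation rotates a 4D feature tensor with a differentiable resampler; plain lists of coordinates keep the idea visible.

```python
import math
import random

def rotation_y(theta):
    """3x3 rigid-body rotation about the vertical (y) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def transform(points, theta):
    """Apply the rotation to every 3D coordinate in the feature block."""
    R = rotation_y(theta)
    return [tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3))
            for p in points]

# During training, a pose is drawn from a uniform distribution,
# e.g. an azimuth in [0, 2*pi), and applied before projection to 2D.
theta = random.uniform(0.0, 2.0 * math.pi)
voxel_coords = [(x, y, z) for x in (-1.0, 1.0)
                          for y in (-1.0, 1.0)
                          for z in (-1.0, 1.0)]
rotated = transform(voxel_coords, theta)
```

Because the transform is rigid, distances to the origin are preserved, which is what lets the generator reuse the same learned 3D features under arbitrary sampled poses.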
ML.NET 1.0 RC releases with support for TensorFlow models and much more!

Amrata Joshi
08 Apr 2019
2 min read
Last week, the team behind ML.NET announced the release of ML.NET 1.0 RC (Release Candidate), an open-source and cross-platform machine learning framework for .NET developers. ML.NET 1.0 RC is the last preview release before the final ML.NET 1.0 RTM (Release to Manufacturing) ships this year. Developers can use ML.NET for sentiment analysis, product recommendation, spam detection, image classification, and much more.

What's new in ML.NET 1.0 RC?

Preview packages
According to the Microsoft blog, "Heading ML.NET 1.0, most of the functionality in ML.NET (around 95%) is going to be released as stable (version 1.0)." The packages that will remain in preview are the TensorFlow, Onnx, TimeSeries, and recommendation components.

IDataView moved to Microsoft.ML namespace
In this release, IDataView has been moved back into the Microsoft.ML namespace based on feedback the team received.

Support for TensorFlow models
This release comes with added support for TensorFlow models, an open-source machine learning framework used for deep learning projects. The issues in ML.NET version 0.11 related to TensorFlow models have been fixed in this release.

Major changes in ML.NET 1.0 RC

The 'Data' namespace has been removed in this release (it was previously accessed using Microsoft.Data.DataView). A NuGet package has been added for Microsoft.ML.FastTree. Also, PoissonRegression has been changed to LbfgsPoissonRegression. To know more about this release, check out the official announcement.

.NET team announces ML.NET 0.6
Qml.Net: A new C# library for cross-platform .NET GUI development
ML.NET 0.4 is here with support for SymSGD, F#, and word embeddings transform!
Ian Goodfellow quits Google and joins Apple as a director of machine learning

Amrata Joshi
08 Apr 2019
3 min read
Last month, Apple took a major step towards strengthening its AI team by hiring Ian Goodfellow, a researcher in the field of machine learning best known for his work on artificial intelligence at Google. At Apple, he will serve as the director of machine learning in the Special Projects Group. The news broke just last week when Goodfellow updated his LinkedIn profile from Senior Staff Research Scientist at Google to Director of Machine Learning at Apple. https://twitter.com/iDownloadBlog/status/1114222866956980226 Andrew Ng, former head of Baidu AI Group/Google Brain, gave his best wishes to Ian Goodfellow, saying his former student Ian joins Apple. https://twitter.com/AndrewYNg/status/1114304151809224704

Ian Goodfellow's tech contribution

Ian Goodfellow is a machine learning researcher who is best known for his contribution of Generative Adversarial Networks, or GANs. Goodfellow has previously interned at Willow Garage, the robotics research lab that brought us the Robot Operating System. At Google, he contributed to TensorFlow-related projects and was a member of the Google Brain team. He has previously worked at OpenAI, an AI research consortium originally funded by Elon Musk and other tech leaders.

Apple's interest in AI

It seems Apple is trying to rank higher amongst the FAANG companies (Facebook, Amazon, Apple, Netflix, Google) in the field of artificial intelligence. Just last year, Apple hired John Giannandrea as the senior vice president of Machine Learning and AI Strategy; he previously worked as the head of AI and search at Google and managed the AI strategy there. https://twitter.com/teslainvernon/status/1113977211369824256 Apart from exploring AI for features like Face ID and Siri, Apple has also been working on autonomous driving technology. Tim Cook, CEO at Apple, has referred to autonomous vehicles as "the mother of all AI projects" in terms of scale, and the work is reportedly similar in nature to that of the Special Projects Group.
Apple has reportedly absorbed the co-founders and a few staff members of Lighthouse, maker of AI-powered home security cameras, after buying Lighthouse AI's patents.

Tough times for Google

The news of Goodfellow's switch to Apple comes just as Google's newly formed Advanced Technology External Advisory Council, meant to help the company with major issues in AI such as facial recognition and machine learning fairness, has been dissolved. The AI team at Google seems to be in a major fix with these latest developments. To know more about this news, check out the post by CNBC.

ACLU (American Civil Liberties Union) file a complaint against the border control officers for violating the constitutional rights of an Apple employee
Google dissolves its Advanced Technology External Advisory Council in a week after repeat criticism on selection of members
Apple officially cancels AirPower; says it couldn't meet hardware's 'high standards'
IBM CEO, Ginni Rometty, on bringing HR evolution with AI and its predictive attrition AI

Natasha Mathur
05 Apr 2019
4 min read
On Wednesday, CNBC held its At Work Talent & HR: Building the Workforce of the Future Conference in New York. Ginni Rometty, IBM CEO (also appointed to Trump's American Workforce Policy Board), discussed several strategies and announcements regarding how jobs are changing due to AI, and IBM's predictive attrition AI. Rometty shared details about an AI that IBM's HR department has filed a patent for, as first reported by CNBC. The AI, developed with Watson (IBM's Q&A AI) for a "predictive attrition program," can predict with 95% accuracy which employees are about to quit. It will also prescribe remedies to managers for better employee engagement. The AI retention tool is part of IBM products designed to transform the traditional approach to HR management. Rometty also mentioned that since IBM has implemented AI more widely, it has been able to reduce the size of its global human resources department by 30 percent.

Rometty states that AI will be effective at tasks where HR departments and corporate managers are not: it will keep employees on a clear career path and will also help identify their skills. Rometty mentions that many companies fail to be 100% transparent with employees regarding their individual career paths and growth, which is a major issue. But IBM's AI can better understand data patterns and adjacent skills, which in turn helps identify an individual's strengths. "We found manager surveys were not accurate. Managers are subjective in ratings. We can infer and be more accurate from data," said Rometty. IBM also eradicated the annual performance review. "We need to bring AI everywhere and get rid of the [existing] self-service system," Rometty said. This is because AI will now help IBM employees better understand which programs they need for growth in their careers.
Also, poor performance is no longer an intractable problem, as IBM is using "pop-up" solution centers that help managers coach better performance from their employees. "I expect AI to change 100 percent of jobs within the next five to 10 years," said Rometty. The need for a "skill revolution" has already been an ongoing topic of discussion in organizations and institutions across the globe as AI keeps advancing. For instance, the Bank of England's chief economist, Andy Haldane, warned last year that the UK needs to skill up overall and across different sectors (tech, health, finance, et al.) as up to 15 million jobs in Britain are at stake. This is because artificial intelligence is replacing a number of jobs which were earlier the preserve of humans. But Rometty has a remedy to prevent this "technological unemployment" in the future. She says, "to get ready for this paradigm shift companies have to focus on three things: retraining, hiring workers that don't necessarily have a four-year college degree and rethinking how their pool of recruits may fit new job roles". IBM also plans to invest $1 billion in training workers for "new collar" jobs, in which workers with tech skills will be hired without a four-year college degree. These "new collar" jobs could include working at a call center, app development, or as a cyber-analyst at IBM via the P-TECH (Pathways in Technology Early College High School) program. P-TECH is a six-year-long course that starts with high school and includes an associate's degree. Other measures by IBM include the CTA Apprenticeship Coalition program, which is aimed at creating thousands of new apprenticeships in 20 US states. These apprenticeships come with frameworks for over 15 different roles across fields including software engineering, data science and analytics, cybersecurity, creative design, and program management. As far as employers are concerned, Rometty advises them to "bring consumerism into the HR model.
Get rid of self-service, and using AI and data analytics personalize ways to retrain, promote and engage employees. Also, move away from centers of excellence to solution centers." For more information, check out the official conversation with Ginni Rometty at the CNBC @Work Summit.

IBM sued by former employees on violating age discrimination laws in workplace
Diversity in Faces: IBM Research's new dataset to help build facial recognition systems that are fair
IBM launches Industry's first 'Cybersecurity Operations Center on Wheels' for on-demand cybersecurity support
Amazon Alexa is HIPAA-compliant: bigger leap in the health care sector

Amrata Joshi
05 Apr 2019
4 min read
Amazon has been exploring the health care sector for quite some time now. Just last year, Amazon bought the online pharmacy PillPack for $1 billion in order to sell prescription drugs. The company introduced Amazon Comprehend Medical, a machine learning tool that allows users to extract relevant clinical information from unstructured text in patient records. Amazon is even working with Accenture and Merck to develop a cloud-based platform for collaborators across the life sciences industry, with the aim of bringing innovation to drug development research. Amazon has now taken a bigger leap by announcing that its voice assistant, Alexa, is HIPAA (Health Insurance Portability and Accountability Act) compliant, which means that it can work with health care and medical software developers to invent new programs or skills with voice and provide better experiences to their customers. With the help of Amazon Alexa, developers will design new skills to help customers manage their healthcare needs at home simply by using voice. Patients will now be able to book a medical appointment, access hospital post-discharge instructions, check on the status of a prescription delivery, and much more, just via voice. HIPAA is designed to protect patients in cases where their personal health information is shared with health care organizations such as hospitals. This will allow healthcare companies to build Alexa voice tools capable of securely transmitting a patient's private information. Consumers will now be able to use new Alexa health skills to ask questions such as "Alexa, pull up my blood glucose readings" or "Alexa, find me a doctor," and will receive a response from the voice assistant. The company further announced the launch of six voice programs, including Express Scripts, My Children's Enhanced Recovery After Surgery (ERAS), Cigna Health Today, Swedish Health Connect, Atrium Health, and Livongo.
These new tools allow patients to use Alexa for accessing personalized information such as prescriptions, progress updates after surgery, and much more. Rachel Jiang, a member of Amazon's health and wellness team who previously worked at Microsoft and Facebook, announced that Amazon has invited six healthcare partners to use its HIPAA-compliant skills kit to build voice programs, and the company expects to get more healthcare providers on board. Jiang wrote in a post, "These new skills are designed to help customers manage a variety of healthcare needs at home simply using voice – whether it's booking a medical appointment, accessing hospital post-discharge instructions, checking on the status of a prescription delivery, and more." Boston Children's Hospital now has a new HIPAA-compliant skill dubbed "ERAS" for kids who are discharged from the hospital, and for their families. With the help of Alexa's voice assistant, patients and their families or caregivers can now ask the care team questions about their case, and doctors can remotely check in on a child's recovery. Livongo, a digital health start-up, works with employers to help them manage workers with chronic medical conditions. Livongo developed a skill for people with diabetes that uses connected glucometers and can answer questions about a patient's blood sugar levels. In a statement to CNBC, Livongo's president Jenny Schneider said there are lots of reasons she expects users to embrace voice technologies versus SMS messaging or other platforms: some of those people might have difficulty reading, or they just have busy lives and it's an easy option. Express Scripts, a pharmacy benefit management organization, is working towards building a way for members to check the status of their home delivery prescriptions via Alexa.
Voice technology has been booming in the health care sector, and skills like the ones mentioned above will bring health care into the home and make patients' lives easier and more cost-effective. John Brownstein, chief innovation officer for Boston Children's Hospital, said, "We're in a renaissance of voice technology and voice assistants in health care. It's so appealing as there's very little training, it's low cost and convenient." To know more about this news, check out Amazon's official announcement.

Amazon won't be opening its HQ2 in New York due to public protests
MariaDB announces MariaDB Enterprise Server and welcomes Amazon's Mark Porter as an advisor to the board of directors
Over 30 AI experts join shareholders in calling on Amazon to stop selling Rekognition, its facial recognition tech, for government surveillance
Google dissolves its Advanced Technology External Advisory Council in a week after repeat criticism on selection of members

Amrata Joshi
05 Apr 2019
3 min read
Last week, Google announced the formation of the Advanced Technology External Advisory Council to help the company with major issues in AI, such as facial recognition and machine learning fairness. Only a week later, Google has decided to dissolve the council, according to reports by Vox. In a statement to Vox, a Google spokesperson said that "the company has decided to dissolve the panel, called the Advanced Technology External Advisory Council (ATEAC), entirely." The company further added, "It's become clear that in the current environment, ATEAC can't function as we wanted. So we're ending the council and going back to the drawing board. We'll continue to be responsible in our work on the important issues that AI raises, and will find different ways of getting outside opinions on these topics." This news comes immediately after a group of Google employees criticized the selection of the council and urged the company to remove Kay Coles James, the Heritage Foundation president, for her anti-trans and anti-immigrant views. James's presence on the council had made other members uncomfortable too. When Joanna Bryson was asked by a user on Twitter if she was comfortable serving on a board with James, she answered, "Believe it or not, I know worse about one of the other people." https://twitter.com/j2bryson/status/1110632891896221696 https://twitter.com/j2bryson/status/1110628450635780097 A few researchers and civil society activists had also voiced their opposition to James's anti-trans and anti-LGBTQ positions. Alessandro Acquisti, a behavioural economist and privacy researcher, had declined an invitation to join the council. https://twitter.com/ssnstudy/status/1112099054551515138 Googlers also insisted on removing Dyan Gibbens, the CEO of Trumbull Unmanned, a drone technology company, from the board. She has previously worked on drones for the US military.
Last year, Google employees were agitated about the fact that the company had been working with the US military on drone technology as part of the so-called Project Maven. A number of employees resigned over it, and Google later promised not to renew the Maven contract. On the ethics front, Google has even offered resources to the US Department of Defense for a "pilot project" to analyze drone footage with the help of artificial intelligence. The question that arises here is: are Googlers and Google's shareholders comfortable with the idea of their software being used by the US military? President Donald Trump's meeting with Google CEO Sundar Pichai adds more fuel to the debate. https://twitter.com/realDonaldTrump/status/1110989594521026561 Though this move by Google seems to be a mark of victory for the more than 2,300 Googlers and supporters who signed the petition and took a stand against transphobia, it is still going to be a tough time for Google as it redefines its AI ethics. The company might have saved itself from this turmoil had it selected the council members more wisely. https://twitter.com/EthicalGooglers/status/1113942165888094215 To know more about this news, check out the blog post by Vox.

Google employees filed petition to remove anti-trans, anti-LGBTQ and anti-immigrant Kay Coles James from the AI council
Is Google trying to ethics-wash its decisions with its new Advanced Tech External Advisory Council?
Amazon joins NSF in funding research exploring fairness in AI amidst public outcry over big tech #ethicswashing
ACLU (American Civil Liberties Union) file a complaint against the border control officers for violating the constitutional rights of an Apple employee

Amrata Joshi
04 Apr 2019
5 min read
United States Customs and Border Protection (CBP), the largest federal law enforcement agency of the United States Department of Homeland Security, has become the talk of the town after its recent violation of an Apple employee's constitutional rights. Just two days ago, the American Civil Liberties Union (ACLU) filed a complaint against border control officers for violating the constitutional rights of Dr. Andreas Gal, an Apple employee, while he was returning from a business trip to Sweden. https://twitter.com/andreasgal/status/1113160811206213632

Who is Dr. Andreas Gal?

Dr. Andreas Gal is a successful entrepreneur and technologist, and a U.S. citizen who is currently an employee at Apple. He is also the former Chief Executive Officer of Silk Labs and the former Chief Technology Officer of Mozilla Corporation. He holds a PhD in Computer Science from the University of California at Irvine. At Mozilla, he took many initiatives to prevent warrantless mass surveillance and spread the use of encryption.

What exactly happened?

The letter by the ACLU highlighted CBP's detention and intrusive interrogation of Dr. Gal and the officers' attempt to search his devices. Last year, on 29th November, Dr. Gal arrived at San Francisco International Airport (SFO) returning from a business trip in Sweden. He holds Global Entry status, which allows expedited clearance for pre-approved, low-risk travellers upon arrival in the United States. After landing at SFO, immigration agents checked his passport on the jetbridge, and he then proceeded to the Global Entry kiosk in the customs and border area. Gal was given a receipt marked 'Tactical Terrorism Response Teams' (TTRT). He presented the receipt from the kiosk, along with his U.S. passport, to an immigration officer. After reviewing Dr. Gal's passport and Global Entry receipt, the officer instructed him to go to Customs Area B but kept Gal's passport.
He was then interrogated in Customs Area B by three CBP officers armed with holstered handguns. Dr. Gal was asked a number of questions regarding his travel plans, his employment history, his work at Apple, and his electronic devices. He was even asked questions about his work at Mozilla and his travel to Canada. Even though Dr. Gal repeatedly asked about the basis for the interrogation, the officers didn't answer. In a statement to The Register, he said, "There I quickly found myself surrounded by three armed agents wearing bullet proof vests. They started to question me aggressively regarding my trip, my current employment, and my past work for Mozilla, a non-profit organization dedicated to open technology and online privacy." He was travelling with an Apple iPhone XS and an Apple MacBook Pro, both issued to him by Apple for software development. The laptop had a sticker which read, "PROPERTY OF APPLE. PROPRIETARY.", whereas the phone had a sticker with a serial number but not Apple's name; its lock screen displayed "Confidential and Proprietary" and a number to call if the device was found. The officers asked Dr. Gal to pull up his itinerary on his mobile phone, searched his wallet and all his luggage, and asked questions about everything they found. Gal told the officers that he would email them the itinerary instead, and that he needed to speak with a lawyer and his employer before giving CBP officers full access to his mobile phone. But the CBP officers ordered Dr. Gal to enter the passcodes to his mobile phone and laptop and hand them over.
https://twitter.com/waltmossberg/status/1113538170707173379 According to the ACLU report, "Gal never refused to provide the passcodes to access the electronic devices in his possession, he only asked that he be allowed to consult with an attorney to ensure that he would not violate non-disclosure agreements with his employer." After the intense interrogation session, Gal was allowed to leave with his devices, but without his Global Entry card, and he was told that his privileges would be revoked because he refused to comply with the search. Gal said, "My past work on encryption and online privacy is well documented, and so is my disapproval of the Trump administration and my history of significant campaign contributions to Democratic candidates. I wonder whether these CBP [Customs and Border Patrol] programs led to me being targeted." https://twitter.com/redhotnerd/status/1113540610546372608

Are developers who travel endangered?

It's important to note that the questions asked of Dr. Gal were not limited to his identity or citizenship; they went much beyond that. After this incident, the question that arises is: how secure are developers on work-related travel? Developers, data analysts, testers, and researchers carry a lot of sensitive data during work-related travel. If they face such an interrogation, they might end up disclosing their company's sensitive data and violating company policies and agreements. https://twitter.com/kifleswing/status/1113514144903331840 To know more about this news, check out the post by ACLU.

Apple officially cancels AirPower; says it couldn't meet hardware's 'high standards'
Donald Trump called Apple CEO Tim Cook 'Tim Apple'
Apple's March Event: Apple changes gears to services, is now your bank, news source, gaming zone, and TV
Lyft introduces Amundsen;  a data discovery and metadata engine for its researchers and data scientists

Amrata Joshi
03 Apr 2019
4 min read
Yesterday, the team at Lyft introduced a data discovery and metadata engine called Amundsen, built to increase the productivity of data scientists and research scientists at Lyft. The team named it after the Norwegian explorer Roald Amundsen. The aim is to make data users' lives simpler with a dedicated data search interface.

According to UNECE (United Nations Economic Commission for Europe), the data in our world has grown over 40x in the last 10 years. This growth in data volumes has given rise to major productivity and compliance challenges that needed solving. The team at Lyft found the solution to these problems not in the actual data, but in the metadata. "Metadata, also defined as 'data about the data', is a set of data that describes and gives information about other data." The team solved a part of the productivity problem using this metadata.

How did the team come up with Amundsen

The team at Lyft realized that the majority of their time was spent on data discovery instead of on prototyping and productionalization, where they actually wanted to invest more time. Data discovery involves answering questions like: Does a certain type of data exist? Where is it? What is the source of truth for that data? Does it need to be accessed? This is why the team at Lyft built Amundsen, inspired largely by search engines like Google, but focused on searching for data within the organization. Users can search for data by typing a search term into the search box, for instance, "election results" or "users". For those who aren't sure what they are looking for, the platform offers a list of the organization's popular tables to browse.
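As a rough illustration of the kind of interface described here, a toy version of "search a term, or fall back to browsing popular tables" might look like the sketch below. The catalog, table names, and query counts are invented for illustration and are not Lyft's actual data or Amundsen's API:

```python
import difflib

# Toy catalog: table name -> (description, times queried recently).
# All names and numbers here are made up for illustration.
CATALOG = {
    "rides.trips": ("One row per completed trip", 9500),
    "rides.drivers": ("Driver profiles", 4200),
    "finance.payouts": ("Weekly driver payouts", 800),
    "marketing.users": ("Registered user accounts", 6100),
}

def search(term, limit=3):
    """Fuzzy-match a search term against table names and descriptions,
    then rank matches by how often each table is queried."""
    if not term:
        # No query entered: fall back to the most popular tables.
        ranked = sorted(CATALOG, key=lambda t: CATALOG[t][1], reverse=True)
        return ranked[:limit]
    matches = []
    for name, (desc, popularity) in CATALOG.items():
        # Similarity of the term to the table name or its description.
        score = max(
            difflib.SequenceMatcher(None, term.lower(), name.lower()).ratio(),
            difflib.SequenceMatcher(None, term.lower(), desc.lower()).ratio(),
        )
        if score > 0.3:
            # Weight text similarity by query popularity, so heavily
            # queried tables rank higher -- loosely in the spirit of
            # the PageRank-like ranking described in the article.
            matches.append((score * popularity, name))
    return [name for _, name in sorted(matches, reverse=True)[:limit]]
```

With an empty query the function simply lists the most-queried tables, mirroring the "popular tables" browsing fallback described above.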
Image Source: Lyft

How does the search ranking feature function

Once the user enters a search term, the results show in-line metadata: a description of the table as well as the date when the table was last updated. These results are chosen by fuzzy-matching the entered text against a few metadata fields, such as the table name, column names, table description, and column descriptions. Ranking uses an algorithm similar to PageRank: highly queried tables show up higher, while those queried less appear later in the search results.

How does the detail page look

After selecting a result, users land on the detail page, which shows the name of the table along with its manually curated description, followed by the column list. A blue arrow next to a column marks it as a popular column, encouraging users to use it. On the right-hand pane, users can see the table's owner, its frequent users, and a general profile of the data.

Image source: Lyft

Further classification of metadata

The team at Lyft divided the metadata into a few categories and gave different access rules to each of them.

Existence and other fundamental metadata

This category includes the name and description of tables and fields, owners, last updated, and so on. This metadata is available to everyone.

Richer metadata

This category includes column stats and previews. It is available only to users who have access to the data itself, because these stats may contain sensitive information that should be considered privileged.

According to the team at Lyft, Amundsen has been successful at Lyft, showing a high adoption rate and Customer Satisfaction (CSAT) score. Users can now easily discover more data in a shorter time.
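The two-tier metadata split could be modelled roughly as below. This is a minimal sketch; the class and field names are illustrative, not Lyft's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class TableMetadata:
    # "Fundamental" metadata: visible to everyone.
    name: str
    description: str
    owner: str
    last_updated: str
    # "Richer" metadata: column stats and previews may contain sensitive
    # values, so they are shown only to users with access to the data.
    column_stats: dict = field(default_factory=dict)
    preview_rows: list = field(default_factory=list)

    def view(self, user_has_data_access: bool) -> dict:
        """Return only the metadata fields this user is allowed to see."""
        visible = {
            "name": self.name,
            "description": self.description,
            "owner": self.owner,
            "last_updated": self.last_updated,
        }
        if user_has_data_access:
            visible["column_stats"] = self.column_stats
            visible["preview_rows"] = self.preview_rows
        return visible
```

The point of the split is that the fundamental fields can always be served from the metadata store, while the richer fields are gated behind the same access check that protects the underlying data.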
Amundsen can also be used to store and tag all personal data within the organization, which can help the organization remain compliant. To know more about this news, check out the official post by Lyft.

Lyft acquires computer vision startup Blue Vision Labs, in a bid to win the self driving car race
Uber and Lyft drivers strike in Los Angeles
Uber open-sources Peloton, a unified Resource Scheduler