
Tech News - Data

1208 Articles

Elvis Pranskevichus on limitations in SQL and how EdgeQL can help

Bhagyashree R
10 May 2019
3 min read
Structured Query Language (SQL), once considered “not a serious language” by its authors, has become the dominant query language for relational databases in the industry. Its battle-tested solutions, stability, and portability make it a reliable choice for operating on your stored data. However, it does have its share of weak points, and that is what Elvis Pranskevichus, founder of EdgeDB, listed in a post titled “We Can Do Better Than SQL” published yesterday. He explained that we now need a “better SQL” and introduced the EdgeQL language, which aims to address the limitations of SQL.

SQL’s shortcomings

Following are some of the shortcomings Pranskevichus discusses in his post:

“Lack of Orthogonality”

Orthogonality is the property that a change in one component has no side effects on any other component. For a query language, it means allowing users to combine a small set of primitive constructs in a small number of ways. Orthogonality leads to a more compact and consistent design; a language without it accumulates many exceptions and caveats. Giving an example, Pranskevichus wrote, “A good example of orthogonality in a programming language is the ability to substitute an arbitrary part of an expression with a variable, or a function call, without any effect on the final result.” SQL does not permit this kind of generic substitution.

“Lack of Compactness”

One side effect of not being orthogonal is a lack of compactness. SQL is also considered verbose because of its goal of being an English-like language catering to “non-professionals”. “However, with the growth of the language, this verbosity has contributed negatively to the ability to write and comprehend SQL queries. We learnt this lesson with COBOL, and the world has long since moved on to newer, more succinct programming languages. In addition to keyword proliferation, the orthogonality issues discussed above make queries more verbose and harder to read,” wrote Pranskevichus in his post.

“Lack of Consistency”

Pranskevichus further adds that SQL is inconsistent in terms of both syntax and semantics. There is also a standardization problem, as different database vendors implement their own versions of SQL, which often end up incompatible with other SQL variants.

Introducing EdgeQL

With EdgeQL, Pranskevichus aims to give users a language that is orthogonal, consistent, and compact, and that at the same time works with the generally applicable relational model. In short, he aims to make SQL better. EdgeQL treats every value as a set and every expression as a function over sets. This design allows you to factor any part of an EdgeQL expression into a view or a function without changing other parts of the query. It has no NULL; a missing value is simply an empty set, which comes with the advantage of having only two boolean logic states. Read Pranskevichus’s original post for more details on EdgeQL.

Building a scalable PostgreSQL solution
PostgreSQL security: a quick look at authentication best practices [Tutorial]
How to handle backup and recovery with PostgreSQL 11 [Tutorial]
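To make the set-based semantics concrete, here is a toy Python model (not EdgeQL itself, and not how EdgeDB is implemented; `apply_fn` is our own illustrative name) of the two rules described above: every value is a set, and every function applies element-wise over the cross product of its input sets, so an empty set propagates naturally where SQL would need NULL:

```python
from itertools import product

def apply_fn(fn, *sets):
    # Apply fn element-wise over the Cartesian product of the input sets.
    # If any input set is empty, the result is empty -- no NULL required.
    return [fn(*args) for args in product(*sets)]

print(apply_fn(lambda a, b: a + b, [1, 2], [10]))  # [11, 12]
print(apply_fn(lambda a, b: a + b, [1, 2], []))    # []
```

In SQL, `1 + NULL` yields NULL and drags three-valued logic through every predicate; in the set model a missing value simply contributes nothing, so boolean logic stays two-valued.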

‘Tableau Day’ highlights: Augmented Analytics, Tableau Prep Builder and Conductor, and more!

Savia Lobo
10 May 2019
4 min read
The Tableau community held a Tableau Day in Mumbai, India, yesterday, where it announced some exciting upcoming developments in Tableau. Highlights of the day included an in-depth explanation of the new Tableau Prep Builder and Conductor, Tableau’s plans to move into augmented analytics, and more. The conference also included a customer story from Nishtha Sharma, Manager at Times Internet, who shared how Tableau helped Times Internet optimize sales and revenue, manage cost per customer, and make business predictions with the help of Tableau dashboards. She said that Times Internet initially solved around 10 business problems with 7 dashboards; following that early success, they are now solving close to 30 business cases with 15 dashboards. Let us have a look at some of the highlights below.

Augmented Analytics: the next step for Tableau

Varun Tandon, a Tableau solution consultant, explained how Tableau is adopting intelligent, or augmented, analytics. Tableau may be moving toward augmented analytics for its platform, where ML and AI can be used to enhance data access and data quality, uncover previously hidden insights, suggest analyses, deliver predictive analytics, suggest actions, and handle many other tasks. Several attendees asked whether Ask Data, Tableau’s new natural language capability included in Tableau 2018.2, was a result of Tableau’s acquisition of Empirical Systems last year. The representatives confirmed the acquisition and mentioned that Tableau plans to build analytics and augmented analytics into Tableau without the need for additional third-party add-ons. However, they did not clarify whether Ask Data resulted from the Empirical Systems acquisition. With Empirical’s NLP module, Tableau users may easily gain insights, make better data-driven decisions, and explore many more features without knowledge of data science or query languages. Doug Henschen, a technology analyst at Constellation Research, explored in his report “Tableau Advances the Era of Smart Analytics” the smart features Tableau Software has introduced and is investing in, and how these capabilities will benefit Tableau customers.

Creating a single hub for data from various sources

The conference explained in detail, with examples, how Tableau can be used as a single hub for data coming from various sources such as NetSuite, Excel, Salesforce, and so on.

New features in Tableau Prep Builder and Conductor

Tableau’s new Prep Builder and Conductor, which saves massive amounts of data preparation time, was also demonstrated and its new features explained in detail, in a session conducted by Shilpa Bhatia, a customer consultant at Tableau Software. Attendees asked whether Tableau Prep Builder and Conductor would replace ETL. The representatives said that it does a good job with data preparation, but users should not confuse it with ETL; they called Tableau Prep Builder and Conductor a “mini ETL”. Since the tool is still evolving, Tableau is shipping monthly updates and will continue to do so for the near future. A question was also asked about the ability to pull data from Tableau Prep into a Jupyter notebook for building data frames; this is not currently possible with Prep Builder and Conductor. The representatives said Tableau Prep is extremely simple to use, but it is a resource-heavy tool, and a dedicated machine with more than 16 GB of RAM will be needed to avoid system lag on large datasets.

The self-service mode in Tableau

Jayen Thakker, a sales consultant at Tableau, explained how one can go beyond dashboards with Tableau. He said that with Tableau’s self-service mode, users can explore and build dashboards on their own without waiting for a developer to build them.

Upcoming Tableau versions

The conference also revealed that Tableau 2019.2 is currently in Beta 2 and is expected to be released next month, with a Beta 3 version before the final release. Each release of Tableau includes around 100 to 150 changes; a few were discussed, including spatial data functions (MakePoint and MakeLine) and next steps for moving beyond Ask Data toward advanced analytics and AI features. The representatives also mentioned that Tableau is working on serving people who need more traditional reporting. To know more about the ‘Tableau Day’ highlights from Mumbai, watch this space or visit Tableau’s official website.

Alteryx vs. Tableau: Choosing the right data analytics tool for your business
Tableau 2019.1 beta announced at Tableau Conference 2018
How to do data storytelling well with Tableau [Video]

Artist Holly Herndon releases an album featuring an artificial intelligence 'musician'

Richard Gall
10 May 2019
6 min read
The strange mixture of panic and excitement around artificial intelligence only grows as the world uncovers new and more novel ways of using it. These waves of innovation feed continuing cycles of stories that have a habit of perpetuating misleading ideas about both the threats and the opportunities it presents. It shouldn't be surprising, then, that there's a serious misunderstanding of what artificial intelligence really is and how it works; as Rowel Atienza told us last month, "we're still very far from robots taking over society." However, artist Holly Herndon (who, incidentally, is a researcher at Stanford) is getting listeners to think differently about artificial intelligence. On her latest album PROTO, released today, she uses it to augment and complement her music.

Holly Herndon's AI agent, Spawn

The special guest that makes PROTO remarkable is Spawn, an AI agent created by Herndon, her husband, and a software engineer. What makes Spawn particularly significant is that Herndon doesn't use it to replace or recreate something, but as something that exists alongside human activity and creativity.

How does Spawn work?

Spawn was 'trained' on the music that Herndon and her band were writing for the album. In essence, this makes it quite different from the way AI is typically used, in that it was developed around a new dataset, not an existing one. When we use existing datasets, especially when we use them uncritically, without any awareness of how they reproduce or hide certain biases, the AI develops around those very biases. By learning from new 'data' that bears all the marks of Herndon's creative decision making, Spawn almost becomes a 'creative' AI agent. If you listen to the album, it's not always easy to spot which parts are created by the artificial intelligence and which are made by human musicians. This combination of creative 'sources' means Herndon's album forces us to ask questions about how we use AI and how it interacts with our lives. It quite deliberately engages with the conversation around ethics in AI that has been taking place across the tech industry over the last year or so.

https://open.spotify.com/album/3PkYFFSJTPxOhnSYBtyZsk?si=OgFCY5p4Tu2u2rK-3mFYjA

"The advent of sampling raised many questions about the ethical use of material created by others," Herndon wrote in a statement published on Twitter at the end of 2018, "but the era of machine legible culture accelerates and abstracts that conversation."

https://twitter.com/hollyherndon/status/1069978436851113985

What does Holly Herndon's album tell us about artificial intelligence?

PROTO raises a number of really important questions about artificial intelligence. First and foremost, it suggests that artificial intelligence isn't a replacement for human intelligence. Spawn isn't used to take jobs from any musicians, but rather extends what's sonically possible; it adds to their capabilities, giving their work a new dimension. Furthermore, just as Herndon refuses to see artificial intelligence as something that can replicate human labor, or creativity, the album also points out one of the very problems with that kind of understanding: the idea that AI can 'replicate' human intelligence at all. Instead, the album's merging of the human and the artificial is a way of exploring the weaknesses of artificial intelligence. This is a way of making AI more transparent. It opens up something that seems so seamless, and highlights the ways it doesn't quite work; it almost refracts, rather than mimics, the sound the human musicians make. As Herndon said in an interview with Jezebel publication The Muse, "the technology is impressive and it's cool but it's really early still. We really wanted to be honest about that and show its mistakes and show how kind of rough the technology is still because... it's more honest and more interesting, to allow it to have its own aesthetic."

https://www.youtube.com/watch?v=r4sROgbaeOs

Read next: Why an algorithm will never win a Pulitzer

The human side of AI technology

But the album does more than just present AI as a flawed tool that can complement human ingenuity. It also asks us about ownership and creativity. It uses the technology as a way of tackling human questions like "what does it mean to create something?" and "who's even allowed to create things?" This is important when we consider that not only does someone control and own a given algorithm, literally owning the intellectual property, but someone also owns and controls the swathes of data that are, at a really fundamental level, crucial to artificial intelligence being possible at all. "The history of music and our shared, human, intellectual project that leads up to today, is a shared resource that we all tap into and we all learn from," Herndon also said in the interview with Jezebel. "So if an individual can just scrape that and then claim so much of that as their own because they hold the keys to this AI, and then they can recreate it, of course it's going to give people anxiety because there's an ethical issue with that."

Read next: Sex robots, artificial intelligence, and ethics: How desire shapes and is shaped by algorithms

Instrumental and aesthetic artificial intelligence

One of the main reasons artificial intelligence has become a buzzword is that it's a tool for industry. It has commercial value; it can improve efficiency by allowing us to do more with less. The value of an album like PROTO, even if it's not the sort of thing you'd usually listen to, is that it removes artificial intelligence from a context in which it is instrumentalized and puts it into one that's purely aesthetic. It changes something we'd typically think about in a functional manner (is it working? is it doing what it's supposed to do?) into something whose very function is open to question. If Herndon's album is able to do that in even the smallest way, then that can only be a good thing, right? And even if it doesn't, at least it sounds good...

Singapore passes controversial bill that criminalizes publishing “fake” news

Vincy Davis
10 May 2019
3 min read
Yesterday, Singapore passed a law criminalizing the publication of fake news, allowing the government to block and order the removal of such content. The bill, ‘The Protection from Online Falsehoods and Manipulation’, was passed by a vote of 72-9 in the Singapore parliament. The law allows the government to demand corrections, order the removal of content, or block websites deemed to be propagating falsehoods contrary to the public interest. Two months ago, Russia passed a law allowing the government to punish individuals and online media for spreading “fake” news and information that disrespects the state. In recent months, other countries like France and Germany have also passed tough laws against fake news or hate speech. Singapore is ranked 151 out of 180 countries in this year's World Press Freedom Index.

What does the Bill cover?

‘The Protection from Online Falsehoods and Manipulation Bill’ gives the Singapore government the power to ban fake news deemed detrimental to Singapore or capable of influencing elections. The government can demand removal of such content or block it outright. Offenders could face a jail term of up to 10 years and hefty fines. Last month, during a visit to Malaysia, Singapore Prime Minister Lee Hsien Loong said that “fake news was a serious problem and other countries including France, Germany and Australia were legislating to combat it”. He added that Singapore’s proposed laws “will be a significant step forward” and that “We’ve deliberated on this now for almost two years. What we have done has worked for Singapore, it is our objective to continue to do things which will work for Singapore.”

Reactions to the Bill

Under the legislation, all of the Singapore government's ministers will be handed powers to demand corrections or order websites to be blocked if they are found to be propagating “falsehoods” contrary to the public interest. Very few people have praised the law; many believe it will target free speech more than fake news. Phil Robertson, deputy Asia director at Human Rights Watch, said, “Singapore’s new 'fake news' law is a disaster for online expression by ordinary #Singaporeans, and a hammer blow against the independence of many online news portals they rely on to get real news about their country beyond the ruling People's Action Party political filter”. He added, “You’re basically giving the autocrats another weapon to restrict speech, and speech is pretty restricted in the region already.” Social media firms have strongly criticized the law, arguing that it hurts freedom of speech by forcing platforms to censor users in order to avoid potential fines. Google, Facebook, and Twitter have all voiced reservations about the ‘fake news’ bill. According to Reuters, Google, which has its Asia headquarters in Singapore, said it was "concerned that this law will hurt innovation" and that "how the law is implemented matters." Authorities around the world may be of the opinion that laws restricting ‘fake news’ are the need of the hour, but it would be good if they first decided what is worse: some fake news on the web, or some big daddy deciding what is right for the people. To know more details about the bill, read the released document.

Facebook hires top EFF lawyer and Facebook critic as WhatsApp privacy policy manager
Will Facebook enforce its updated “remove, reduce, and inform” policy to curb fake news and manage problematic content?
OpenAI’s new versatile AI model, GPT-2 can efficiently write convincing fake news from just a few words

Linux Foundation forms Urban Computing Foundation: a set of open source tools to build autonomous vehicles and smart infrastructure

Fatema Patrawala
09 May 2019
3 min read
The Linux Foundation, the nonprofit organization enabling mass innovation through open source, on Tuesday announced the formation of the Urban Computing Foundation (UCF). UCF will accelerate open source software that improves mobility, safety, road infrastructure, traffic congestion, and energy consumption in connected cities. Its mission is to enable developers, data scientists, visualization specialists, and engineers to improve urban environments, quality of life, and city operation systems, and to build connected urban infrastructure. Founding members of UCF include Facebook, Google, IBM, Interline Technologies, Uber, and UC San Diego, among others. Jim Zemlin, executive director of the Linux Foundation, told VentureBeat that the Foundation will adopt an open governance model developed by its Technical Advisory Council (TAC), which will include technical and IP stakeholders in urban computing who will guide its work by reviewing and curating projects. The intent, added Zemlin, is to provide platforms to developers who seek to address traffic congestion, pollution, and other problems plaguing modern metros.

Here’s the list of TAC members:

  • Drew Dara-Abrams, principal, Interline Technologies
  • Oliver Fink, director, Here XYZ, Here Technologies
  • Travis Gorkin, engineering manager of data visualization, Uber
  • Shan He, project leader of Kepler.gl, Uber
  • Randy Meech, CEO, StreetCred Labs
  • Michal Migurski, engineering manager of spatial computing, Facebook
  • Drishtie Patel, product manager of maps, Facebook
  • Paolo Santi, senior researcher, MIT
  • Max Sills, attorney, Google

On Tuesday, Facebook announced its participation as a founding member of the Urban Computing Foundation. https://twitter.com/fb_engineering/status/1125783991452180481 Facebook mentions in its post: “We are using our expertise — including a predictive model for mapping electrical grids, disaster maps, and more accurate population density maps — to improve access to this type of technology”. Facebook further mentions that UCF will establish a neutral space for this critical work, which will include adapting geospatial and temporal machine learning techniques for urban environments and developing simulation methodologies for modeling and predicting citywide phenomena. Uber also announced its joining, along with its contribution of Kepler.gl as the initiative’s first official project. Kepler.gl is Uber’s open source, no-code geospatial analysis tool for large-scale data sets. Released in 2018, it is currently used by Airbnb, Atkins Global, Cityswifter, Lime, Mapbox, Sidewalk Labs, and UBILabs, among others, to generate visualizations of location data. While all of this sets a path toward smarter cities, it also raises an alarm about yet another avenue for violating privacy and mishandling user data, given the tech industry's history. Recently, Amnesty International in Canada said the Google Sidewalk Labs project in Toronto normalizes mass surveillance and is a direct threat to human rights. Questions have been raised about tech companies forming a foundation to address traffic congestion but not privacy violations or online extremism. https://twitter.com/shannoncoulter/status/1126199285530238976

The Linux Foundation announces the CHIPS Alliance project for deeper open source hardware integration
Mapzen, an open-source mapping platform, joins the Linux Foundation project
Uber becomes a Gold member of the Linux Foundation

Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows

Packt Editorial Staff
09 May 2019
8 min read
A typical deep learning workflow starts with ideation and research around a problem statement, where the architectural design and model decisions come into play. Following this, the theoretical model is tested with prototypes. This includes trying out different models or techniques, such as skip connections, or making decisions on what not to try out. PyTorch started as a research framework built by a Facebook intern, and it has grown into a framework used both for research and prototyping and for writing efficient models with serving modules. The PyTorch deep learning workflow is broadly equivalent to the workflow implemented by almost everyone in the industry, even for highly sophisticated implementations, with slight variations. In this article, we explain the core of the ideation, planning, design, and experimentation phases of the PyTorch deep learning workflow.

This article is an excerpt from the book PyTorch Deep Learning Hands-On by Sherin Thomas and Sudhanshu Passi. The book attempts to provide an entirely practical introduction to PyTorch, with numerous examples and dynamic AI applications demonstrating the simplicity and efficiency of the PyTorch approach to machine intelligence and deep learning.

Ideation and planning

Usually, in an organization, the product team comes up with a problem statement for the engineering team, to find out whether they can solve it or not. This is the start of the ideation phase. In academia, this could instead be the decision phase, where candidates have to find a problem for their thesis. In the ideation phase, engineers brainstorm and find the theoretical implementations that could potentially solve the problem. In addition to converting the problem statement to a theoretical solution, the ideation phase is where we decide what the data types are and what dataset we should use to build the proof of concept (POC) of the minimum viable product (MVP). This is also the stage where the team decides which framework to go with, by analyzing the behavior of the problem statement, the available implementations, available pretrained models, and so on. This stage is very common in the industry, and I have come across numerous examples where a well-planned ideation phase helped the team roll out a reliable product on time, while a poorly planned ideation phase destroyed the whole product.

Design and experimentation

The crucial part of design and experimentation lies in the dataset and its preprocessing. For any data science project, the major share of time is spent on data cleaning and preprocessing, and deep learning is no exception. Data preprocessing is one of the vital parts of building a deep learning pipeline. Real-world datasets are usually not cleaned or formatted for a neural network to process; conversion to floats or integers, normalization, and so on are required before further processing. Building a data processing pipeline is also a non-trivial task, which involves writing a lot of boilerplate code. To make this much easier, dataset builders and the DataLoader pipeline package are built into the core of PyTorch.

The dataset and DataLoader classes

Different types of deep learning problems require different types of datasets, and each of them might require different types of preprocessing depending on the neural network architecture we use. This is one of the core problems in deep learning pipeline building. Although the community has made datasets for different tasks freely available, writing a preprocessing script is almost always painful. PyTorch solves this problem by providing abstract classes for writing custom datasets and data loaders. The example given here is a simple dataset class to load the fizzbuzz dataset, but extending it to handle any type of dataset is fairly straightforward.
PyTorch's official documentation uses a similar approach to preprocess an image dataset before passing it to a complex convolutional neural network (CNN) architecture. A dataset class in PyTorch is a high-level abstraction that handles almost everything required by the data loaders. The custom dataset class defined by the user needs to override the __len__ and __getitem__ functions of the parent class: __len__ is used by the data loaders to determine the length of the dataset, and __getitem__ is used by the data loaders to get each item. The __getitem__ function expects the index as an argument and returns the item that resides at that index:

from dataclasses import dataclass
from torch.utils.data import Dataset, DataLoader

@dataclass(eq=False)
class FizBuzDataset(Dataset):
    input_size: int
    start: int = 0
    end: int = 1000

    def encoder(self, num):
        # binary-encode the number, left-padded with zeros to input_size
        ret = [int(i) for i in '{0:b}'.format(num)]
        return [0] * (self.input_size - len(ret)) + ret

    def __getitem__(self, idx):
        x = self.encoder(idx)
        if idx % 15 == 0:
            y = [1, 0, 0, 0]
        elif idx % 5 == 0:
            y = [0, 1, 0, 0]
        elif idx % 3 == 0:
            y = [0, 0, 1, 0]
        else:
            y = [0, 0, 0, 1]
        return x, y

    def __len__(self):
        return self.end - self.start

The implementation of this custom dataset uses the brand-new dataclasses from Python 3.7. dataclasses help eliminate boilerplate code for Python magic functions, such as __init__, using dynamic code generation. This requires the code to be type-hinted, and that's what the first three lines inside the class are for. You can read more about dataclasses in the official Python documentation (https://docs.python.org/3/library/dataclasses.html). The __len__ function returns the difference between the end and start values passed to the class. In the fizzbuzz dataset, the data is generated by the program.
The implementation of data generation is inside the __getitem__ function, where the class instance generates the data based on the index passed by DataLoader. PyTorch made the class abstraction as generic as possible, so that the user can define what the data loader should return for each id. In this particular case, the class instance returns the input and output for each index, where the input x is the binary-encoded version of the index itself, and the output is the one-hot-encoded label with four states. The four states represent whether the number is a multiple of three (fizz), a multiple of five (buzz), a multiple of both three and five (fizzbuzz), or not a multiple of either.

Note: For Python newbies, the way the dataset works can be understood by looking first at the loop that iterates over the integers from zero to the length of the dataset (the length is returned by the __len__ function when len(object) is called). The following snippet shows the simple loop (note that FizBuzDataset declares no default for input_size, so it must be passed explicitly):

dataset = FizBuzDataset(input_size=10)
for i in range(len(dataset)):
    x, y = dataset[i]

dataloader = DataLoader(dataset, batch_size=10, shuffle=True,
                        num_workers=4)
for batch in dataloader:
    print(batch)

The DataLoader class accepts a dataset class that inherits from torch.utils.data.Dataset. DataLoader takes the dataset and performs non-trivial operations such as mini-batching, multithreading, shuffling, and so on, to fetch the data. It accepts a dataset instance from the user and uses a sampler strategy to sample data as mini-batches. The num_workers argument decides how many parallel workers should be operating to fetch the data. This helps to avoid a CPU bottleneck, so that the CPU can keep up with the GPU's parallel operations. Data loaders also allow users to specify whether to use pinned CUDA memory, which copies the data tensors into CUDA's pinned memory before returning them to the user.
Using pinned memory is the key to fast data transfers between devices, since the data is loaded into pinned memory by the data loader itself, using multiple CPU cores. Most often, especially while prototyping, custom datasets might not be available, and developers have to rely on existing open datasets. The good thing about working with open datasets is that most of them are free from licensing burdens, and thousands of people have already tried preprocessing them, so the community can help out. PyTorch provides utility packages for all three major types of datasets (vision, text, and audio), with pretrained models, preprocessed datasets, and utility functions for working with them.

This article covered how to build a basic pipeline for deep learning development. The system we defined here is a very common, general approach, followed with slight changes by different sorts of companies. The benefit of starting with a generic workflow like this is that you can build a really complex workflow on top of it as your team or project grows.

Build deep learning workflows and take deep learning models from prototyping to production with PyTorch Deep Learning Hands-On written by Sherin Thomas and Sudhanshu Passi.

F8 PyTorch announcements: PyTorch 1.1 releases with new AI tools, open sourcing BoTorch and Ax, and more
Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs
Top 10 deep learning frameworks
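To see what DataLoader's mini-batching and shuffling amount to without installing PyTorch, here is a plain-Python sketch (the function name data_loader and the stand-in dataset are our own, not part of the PyTorch API): it samples shuffled indices and yields fixed-size batches, which is the core of what DataLoader does on top of its multiprocessing and pinned-memory machinery:

```python
import random

def data_loader(dataset, batch_size=10, shuffle=True, seed=0):
    # Minimal sketch of DataLoader's sampling loop: shuffle the
    # indices (the "sampler"), then yield collated mini-batches.
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = [dataset[i] for i in indices[start:start + batch_size]]
        xs, ys = zip(*batch)  # collate (x, y) pairs into two sequences
        yield list(xs), list(ys)

# Stand-in dataset of (x, y) pairs, like FizBuzDataset's __getitem__ output.
dataset = [(i, i % 15 == 0) for i in range(25)]
batches = list(data_loader(dataset, batch_size=10))
print([len(xs) for xs, ys in batches])  # [10, 10, 5]
```

The last batch is smaller when the dataset size is not a multiple of batch_size, which is why the real DataLoader offers a drop_last option.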
#MSBuild2019: Microsoft launches new products to secure elections and political campaigns

Sugandha Lahoti
07 May 2019
2 min read
It seems big tech giants are getting pretty serious about protecting election integrity and adopting data protection measures. At the ongoing Microsoft Build 2019 developer conference, CEO Satya Nadella announced ElectionGuard, a free open-source software development kit (SDK), as an extension of Microsoft's Defending Democracy Program.

ElectionGuard SDK

It is an open-source SDK and voting system reference implementation that was developed in partnership with Galois. This SDK will provide voting system vendors with the ability to enable end-to-end verifiability and improved risk-limiting audit capabilities for elections in their systems. It will be offered free to voting system vendors, either to integrate into their existing systems or to use to build all-new election systems.

"One of the things we want to ensure is real transparency and verifiability in election systems. And so this is an open source project that will be alive on GitHub by the end of this month, which will even bring some new technology from Microsoft Research around homomorphic encryption, so that you can have the software stack that can modernize all of the election infrastructure everywhere in the world," CEO Satya Nadella said onstage today at Microsoft's annual Build developer conference in Seattle.

The ElectionGuard SDK and reference implementation will be available on GitHub in June, just ahead of the EU elections.

Microsoft 365 for Campaigns

Microsoft 365 for Campaigns provides the security capabilities of Microsoft 365 Business to political parties and individual candidates. M365 for Campaigns will be rolled out to customers this summer for $5 per user per month. Any campaign using M365 for Campaigns will have free access to Microsoft's AccountGuard service. Microsoft claims it'll be affordable, naturally, and "preconfigured to optimize for the unique operating environments campaigns face."
Starting next month, M365 for Campaigns will be available for all federal election campaign candidates, federal candidate committees, and national party committees in the United States.

Microsoft Build is in its 6th year and will continue till 8th May. The conference hosts over 6,000 attendees, with nearly 500 student-age developers and over 2,600 customers and partners in attendance. Watch this space for more coverage of Microsoft Build 2019.

Microsoft introduces Remote Development extensions to make remote development easier on VS Code

Docker announces a collaboration with Microsoft's .NET at DockerCon 2019

How Visual Studio Code can help bridge the gap between full-stack development and DevOps [Sponsored by Microsoft]
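The homomorphic encryption Nadella alluded to is the key technical idea behind ElectionGuard's verifiability: encrypted ballots can be added up, and the tally decrypted, without ever decrypting an individual vote. ElectionGuard's actual construction is not described in this article, so the following is only a toy, deliberately insecure sketch of an additively homomorphic scheme (exponential ElGamal with small, illustrative parameters):

```python
import random

# Toy parameters: a known Mersenne prime and a small base.
# Real systems use large, carefully chosen groups; this is insecure.
p = 2**61 - 1
g = 3

sk = random.randrange(2, p - 1)   # secret key
pk = pow(g, sk, p)                # public key

def encrypt(vote):
    """Encrypt a 0/1 vote as the ElGamal pair (g^r, g^vote * pk^r)."""
    r = random.randrange(2, p - 1)
    return pow(g, r, p), (pow(g, vote, p) * pow(pk, r, p)) % p

def add(c1, c2):
    """Multiplying ciphertexts componentwise adds the underlying votes."""
    return (c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p

def decrypt(c):
    """Recover the (small) tally by brute-force discrete log of g^m."""
    a, b = c
    gm = (b * pow(a, p - 1 - sk, p)) % p   # b / a^sk mod p
    m, acc = 0, 1
    while acc != gm:
        m, acc = m + 1, (acc * g) % p
    return m
```

Summing encrypt(1), encrypt(0), and encrypt(1) with add and then decrypting yields 2, without any individual ballot being opened.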
OpenAI: Two new versions and the output dataset of GPT-2 out!

Vincy Davis
07 May 2019
3 min read
Today, OpenAI released new versions of GPT-2, its AI model capable of generating coherent paragraphs of text without needing any task-specific training. The release includes a medium 345M-parameter version and a small 117M-parameter version of GPT-2. OpenAI has also shared the 762M and 1.5B versions with partners in the AI and security communities who are working to improve societal preparedness for large language models.

GPT was first released in 2018. In February 2019, OpenAI announced GPT-2 along with many samples and policy implications.

Read More: OpenAI's new versatile AI model, GPT-2 can efficiently write convincing fake news from just a few words

The team at OpenAI has decided on a staged release of GPT-2, gradually releasing the family of models over time. The reason behind the staged release is to give people time to assess the properties of these models, discuss their societal implications, and evaluate the impacts of release after each stage.

The 345M-parameter version of GPT-2 has improved performance relative to the 117M version, though it does not offer the same ease of generating coherent text as the larger versions; the 345M version would also be more difficult to misuse. Many factors, such as ease of use for generating coherent text, the role of humans in the text generation process, the likelihood and timing of future replication and publication by others, evidence of use in the wild, and expert-informed inferences about unobservable uses, were considered while releasing this staged 345M version.

The team is hopeful that ongoing research on bias, detection, and misuse will encourage them to publish larger models, and in six months they will share a fuller analysis of language models' societal implications and their heuristics for release decisions.
The team at OpenAI is looking for partnerships with academic institutions, non-profits, and industry labs that will focus on increasing societal preparedness for large language models. They are also open to collaborating with researchers working on language model output detection, bias, and publication norms, and with organizations potentially affected by large language models.

The output dataset contains GPT-2 outputs from all four model sizes, with and without top-k truncation, as well as a subset of the WebText corpus used to train GPT-2. The dataset features approximately 250,000 samples per model/hyperparameter pair, which should be sufficient to help a wider range of researchers perform quantitative and qualitative analysis.

To know more about the release, head over to the official release announcement.

OpenAI introduces MuseNet: A deep neural network for generating musical compositions

OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes next in a sequence

OpenAI Five bots destroyed human Dota 2 players this weekend
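The "top-k truncation" mentioned for the output dataset refers to restricting sampling at each step to the k most likely next tokens. A minimal sketch of the idea (using NumPy, not OpenAI's actual code; the default k=40 is illustrative):

```python
import numpy as np

def top_k_sample(logits, k=40, rng=None):
    """Sample a token id after discarding all but the k largest logits."""
    rng = rng if rng is not None else np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    keep = np.argpartition(logits, -k)[-k:]   # indices of the k largest logits
    masked = np.full_like(logits, -np.inf)
    masked[keep] = logits[keep]
    # softmax over the surviving logits; exp(-inf) = 0 removes the rest
    probs = np.exp(masked - masked[keep].max())
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)
```

With truncation off (k equal to the vocabulary size), every token stays eligible, which is the "without top-k truncation" half of the released dataset.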
DuckDuckGo proposes “Do-Not-Track Act of 2019” to require sites to respect DNT browser setting

Sugandha Lahoti
07 May 2019
3 min read
DuckDuckGo, known for its privacy protection policies, has proposed draft legislation that would require sites to respect the Do Not Track browser setting. Called the "Do-Not-Track Act of 2019", this legislation would mandate that websites not track people who have enabled the DNT signal in their browsers. Per a recent study conducted by DuckDuckGo, a quarter of people have turned on this setting, and most were unaware that big sites do not respect it.

A "Do-Not-Track Signal" means a signal, sent by a web browser or similar User Agent, that conveys a User's choice regarding online Tracking, reflects a deliberate choice by the user, and complies with the latest Tracking Preference Expression (DNT) specification published by the World Wide Web Consortium (W3C).

DuckDuckGo's act comes just days after Google announced more privacy controls for its users. Last week, Google launched a new feature allowing users to manually delete all or part of their location history and web and app activity data. It also added a time limit for how long activity data is saved, 3 or 18 months, before deleting it automatically. However, it does not have an option to not store history at all.

DuckDuckGo's proposed Do-Not-Track Act of 2019 details the following points:

No third-party tracking by default. Data brokers would no longer be legally able to use hidden trackers to slurp up your personal information from the sites you visit. And the companies that deploy the most trackers across the web, led by Google, Facebook, and Twitter, would no longer be able to collect and use your browsing history without your permission.

No first-party tracking outside what the user expects. For example, if you use Whatsapp, its parent company (Facebook) wouldn't be able to use your data from Whatsapp in unrelated situations (like for advertising on Instagram, also owned by Facebook).
As another example, if you go to a weather site, it could give you the local forecast, but not share or sell your location history.

The legislation would have exceptions for debugging, auditing, security, non-commercial research, and journalism. However, each of these exceptions would only apply if a site adopts strict data-minimization practices. These include using the least amount of personal information needed, and anonymizing it when possible. Also, the restrictions would only come into play if a consumer has turned on the Do Not Track setting in their browser.

In case of violation of the Do-Not-Track Act of 2019, DuckDuckGo proposes that legislators could charge an amount no less than $50,000 and no more than $10,000,000 or 2% of an organization's annual revenue, whichever is greater.

If the act passes into law, sites would be required to cease certain user tracking methods, which means less data available to inform marketing and advertising campaigns. The proposal is still quite far from turning into law, but presidential candidate Elizabeth Warren's recent proposal to regulate "big tech companies" may give it a much-needed boost.

Twitter users complimented the act.

https://twitter.com/Bendineliot/status/1123579280892538881
https://twitter.com/jmhaigh/status/1123574469950414848
https://twitter.com/n0ahrabbit/status/1123572013153439745

For the full text, download the proposed Do-Not-Track Act of 2019.

DuckDuckGo now uses Apple MapKit JS for its map and location-based searches

DuckDuckGo chooses to improve its products without sacrificing user privacy

'Ethical mobile operating system' /e/, an alternative for Android and iOS, is in beta
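On the wire, the DNT signal the act hinges on is just an HTTP request header, DNT: 1, per the W3C Tracking Preference Expression specification. A site honoring it only needs a check like the hypothetical helper below (the function name is an illustrative assumption):

```python
def tracking_allowed(headers):
    """Return False when the request carries a Do-Not-Track opt-out (DNT: 1).

    `headers` is any mapping of request header names to values; keys are
    normalized here so that plain dicts with any casing work too.
    """
    normalized = {k.lower(): v for k, v in headers.items()}
    return normalized.get("dnt") != "1"
```

Under the proposed act, a False result would oblige the site to disable third-party trackers and unexpected first-party tracking for that request.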
An unsupervised deep neural network cracks 250 million protein sequences to reveal biological structures and functions

Vincy Davis
07 May 2019
4 min read
One of the goals for artificial intelligence in biology is the creation of controllable predictive and generative models that can read and generate biology in its native language. Artificial neural networks, with their proven pattern recognition capabilities, have been utilized in many areas of bioinformatics. Accordingly, research is necessary into methods that can learn intrinsic biological properties directly from protein sequences, which can then be transferred to prediction and generation.

Last week, Alexander Rives and Rob Fergus from the Dept. of Computer Science, New York University, together with Siddharth Goyal, Joshua Meier, Demi Guo, Myle Ott, C. Lawrence Zitnick and Jerry Ma from the Facebook AI Research team, published a paper titled 'Biological Structure And Function Emerge From Scaling Unsupervised Learning to 250 Million Protein Sequences'. This paper investigates scaling high-capacity neural networks to extract general and transferable information about proteins from raw sequences.

Next-generation sequencing (NGS) has revolutionized the biological field. It has enabled a wide variety of applications and made it possible to study biological systems at a detailed level. Recently, due to reductions in the cost of this technology, there has been exponential growth in the size of biological sequence datasets. When data is sampled across diverse sequences, it helps in studying predictive and generative techniques for biology using artificial intelligence. In this paper, the team has investigated deep learning across evolution at the scale of the largest available protein sequence databases.

What does the research involve

The researchers have applied self-supervision to the problem of understanding protein sequences and explored what information representation learning can capture. They trained a neural network by predicting masked amino acids.
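The training objective, predicting masked amino acids, is the masked-language-modelling idea familiar from BERT, applied to protein sequences. A deliberately tiny sketch of that objective follows; the model size, masking rate, and vocabulary layout are assumptions for illustration, nothing like the paper's high-capacity network:

```python
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues
MASK_ID, VOCAB = 20, 21                 # one extra slot for the mask token

class TinyProteinLM(nn.Module):
    """A toy stand-in for the paper's high-capacity Transformer."""
    def __init__(self, d=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, VOCAB)

    def forward(self, tokens):
        return self.head(self.encoder(self.embed(tokens)))

def mask_tokens(tokens, rate=0.15):
    """Replace ~15% of positions with the mask token; remember where."""
    is_masked = torch.rand(tokens.shape) < rate
    masked = tokens.clone()
    masked[is_masked] = MASK_ID
    return masked, is_masked

# one self-supervised training step on random stand-in "sequences"
tokens = torch.randint(0, len(AMINO_ACIDS), (8, 100))   # batch of 8, length 100
masked, is_masked = mask_tokens(tokens)
logits = TinyProteinLM()(masked)
# the loss is computed only at the masked positions: the network must
# recover the original residue from its surrounding context
loss = nn.functional.cross_entropy(logits[is_masked], tokens[is_masked])
loss.backward()
```

Repeating this step over millions of real sequences is what lets the learned representations pick up biological structure without any labels.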
For training the neural network, a dataset containing 250M protein sequences with 86 billion amino acids was used during the research. The resulting model maps raw sequences to representations of biological properties without any prior domain knowledge. The neural network represents the identity of each amino acid in its input and output embeddings. The space of representations learned from sequences provides biological structure information at many levels, including that of amino acids, proteins, groups of orthologous genes, and species. Information about secondary and tertiary structure is internalized and represented within the network in a generalizable form.

Observations from the research

Finally, the paper states that it is possible to adapt networks that have been trained on evolutionary data to give results using only features that have been learned from sequences, i.e., without any prior knowledge. It was also observed that even the highest-capacity models trained still underfit the 250M sequences, due to insufficient model capacity. The researchers are certain that using trained network architectures along with predictive models will help in generating and optimizing new sequences for desired functions. This should also work for sequences that have not been seen before in nature but that are biologically active. They have tried to use unsupervised learning to recover representations that can map multiple levels of biological granularity.

https://twitter.com/soumithchintala/status/1123236593903423490

But the results of the paper do not satisfy the community completely. Some are of the opinion that the paper is hard to follow and leaves some information unarticulated. For example, it is not specified which representations of biological properties the model maps.
A user on Reddit commented, "Like some of the other ML/AI posts that made it to the top page today, this research too does not give any clear way to reproduce the results. I looked through the pre-print page as well as the full manuscript itself. Without reproducibility and transparency in the code and data, the impact of this research is ultimately limited. No one else can recreate, iterate, and refine the results, nor can anyone rigorously evaluate the methodology used."

Another user added, "This is cool, but would be significantly cooler if they did some kind of biological follow up. Perhaps getting their model to output an "ideal" sequence for a desired enzymatic function and then swapping that domain into an existing protein lacking the new function."

Create machine learning pipelines using unsupervised AutoML [Tutorial]

Rigetti develops a new quantum algorithm to supercharge unsupervised Machine Learning

RIP Nils John Nilsson; an AI visionary, inventor of A* algorithm, STRIPS automatic planning system and many more
Palantir’s software was used to separate families in a 2017 operation reveals Mijente

Savia Lobo
06 May 2019
4 min read
Documents released this week reveal that the data mining firm Palantir was responsible for a 2017 operation that targeted and arrested family members of children crossing the border alone. The documents stand in stark contrast to what Palantir said its software was doing. This discrepancy was first identified by Mijente, an advocacy organization that has closely tracked Palantir's murky role in immigration enforcement. The documents confirm "the role Palantir technology played in facilitating hundreds of arrests, only a small fraction of which led to criminal prosecutions", The Intercept reports.

Palantir, a software firm founded by Peter Thiel, one of President Trump's most vocal supporters in Silicon Valley, develops software that helps agents analyze massive amounts of personal data and build profiles for prosecution and arrest.

In May 2018, Amazon employees, in a letter to Jeff Bezos, protested against the sale of its facial recognition tech to Palantir, saying they "refuse to contribute to tools that violate human rights" and citing the mistreatment of refugees and immigrants by ICE.

Read Also: Amazon addresses employees dissent regarding the company's law enforcement policies at an all-staff meeting, in a first

Palantir earlier said it was not involved with the part of ICE strictly devoted to deportations and the enforcement of immigration laws, whereas Palantir's $38 million contract was with Homeland Security Investigations, or HSI, a component of ICE with a far broader criminal enforcement mandate.

https://twitter.com/ConMijente/status/1124056308943138834

The 2017 ICE operation was designed to dissuade children from joining family members in the United States by targeting parents and sponsors for arrest.
According to The Intercept, "Documents obtained through Freedom of Information Act litigation and provided to The Intercept show that this claim, that Palantir software is strictly involved in criminal investigations as opposed to deportations, is false."

As part of the operation, ICE arrested 443 people solely for being undocumented. Palantir's software was used throughout, helping agents build profiles of immigrant children and their family members for the prosecution and arrest of any undocumented person they encountered in their investigation.

https://twitter.com/ConMijente/status/1124056314106322944

"The operation was underway as the Trump administration detained hundreds of children in shelters throughout the country. Unaccompanied children were taken by border agents, sent to privately-run facilities, and held indefinitely. Any undocumented parent or family member who came forward to claim children was arrested by ICE for deportation. More children were kept in detention longer, as relatives stopped coming forward", Mijente reports.

Mijente further mentions in their post, "Mijente is urging Palantir to drop its contract with ICE and stop providing software to agencies that aid in tracking, detaining, and deporting migrants, refugees, and asylum seekers. As Palantir plans its initial public offering, Mijente is also calling on investors not to invest in a company that played a key role in family separation."

The seven-page document, titled "Unaccompanied Alien Children Human Smuggling Disruption Initiative," details how one of Palantir's software solutions, Investigative Case Management (ICM), can be used by agents stationed at the border to build cases against unaccompanied children and their families. Mijente further mentions, "This document is further proof that Palantir's software directly aids in prosecutions for deportation carried out by HSI agents. Not only are HSI agents involved in deportations in the interior, but they are also actively aiding border agents by investigating and prosecuting relatives of unaccompanied children hoping to join their families."

Jesse Franzblau, senior policy analyst for the National Immigrant Justice Center, said in an email to The Intercept, "The detention and deportation machine is not only driven by hate, but also by profit. Palantir profits from its contract with ICE to help the administration target parents and sponsors of children, and also pays Amazon to use its servers in the process. The role of private tech behind immigration enforcement deserves more attention, particularly with the growing influence of Silicon Valley in government policymaking."

Yet, Palantir's executives have made no move to cancel their work with ICE. Its co-founder and CEO, Alex Karp, said he's "proud" to work with the United States government. Last year, he reportedly ignored employees who "begged" him to end the firm's contract with ICE, the Mijente report mentions.

To know more about this news in detail, head over to the official report.

Lerna relicenses to ban major tech giants like Amazon, Microsoft, Palantir from using its software as a protest against ICE

Free Software Foundation updates their licensing materials, adds Commons Clause and Fraunhofer FDK AAC license

"We can sell dangerous surveillance systems to police or we can stand up for what's right. We can't do both," says a protesting Amazon employee
Deeplearning4J 1.0.0-beta4 released with full multi-datatype support, new attention layers, and more!

Vincy Davis
03 May 2019
3 min read
Yesterday, Deep Learning for Java (DL4J) released its new beta version, DL4J 1.0.0-beta4. The main highlight of this version is full multi-datatype support for ND4J and DL4J, unlike past releases. The previous version, deeplearning4j-1.0.0-beta3, was released last year. The 1.0.0-beta4 version also includes the addition of MKL-DNN support, new attention layers, and more, along with optimizations and bug fixes.

What's new in DL4J 1.0.0-beta4?

Full multi-datatype support

In past releases, all N-dimensional arrays in ND4J were limited to a single datatype, set globally. Now, arrays of all datatypes may be used simultaneously. The supported datatypes are Double, Float, Half, Long, Int, Short, Ubyte, Byte, Bool and UTF8.

CUDA Support

CUDA 10.1 support has been added and CUDA 9.0 support has been dropped. DL4J 1.0.0-beta4 also supports CUDA versions 9.2, 10.0 and 10.1. Mac (OSX) CUDA binaries are no longer provided. However, support for Linux and Windows CUDA, and OSX CPU (x86_64), is still available.

Memory Management Changes

In DL4J 1.0.0-beta4, periodic garbage collection is disabled by default; instead, garbage collection (GC) will be called only when it is required to reclaim memory from arrays that are allocated outside of workspaces.

Deeplearning4J: Bug Fixes and Optimizations

cuDNN helpers will no longer attempt to fall back on built-in layer implementations if an out-of-memory exception is thrown. Batch normalization global variance is reparameterized to avoid underflow and zero/negative variance in some cases during distributed training. A bug where dropout instances were incorrectly shared between layers when using transfer learning with dropout has been fixed. An issue where tensorAlongDimension could result in an incorrect array order for edge cases, and hence exceptions in LSTMs, has been fixed.
ND4J and SameDiff: Features and Enhancements

Reliance on periodic garbage collection calls for handling memory management of out-of-workspace (detached) INDArrays has been removed. New additions include the TensorFlowImportValidator tool, the INDArray.close() method, the Nd4j.createFromNpzFile method, support for importing BERT models into SameDiff, SameDiff GraphTransformUtil, and more. Evaluation, RegressionEvaluation, etc. now support 4d (CNN segmentation) data formats.

Bug Fixes and Optimizations

The bug with InvertMatrix.invert() with [1,1]-shape matrices has been fixed. An edge-case bug for Updater instances with length-1 state arrays has been fixed. In SameDiff, gradients are now no longer defined for non-floating-point variables, or for variables that aren't required to calculate loss or parameter gradients; thus, gradient calculation performance has improved.

To know more about the release, check the detailed release notes.

Deeplearning4j 1.0.0-alpha arrives!

7 things Java programmers need to watch for in 2019

Deep Learning Indaba presents the state of Natural Language Processing in 2018
Oakland Privacy Advisory Commission lay out privacy principles for Oaklanders and propose ban on facial recognition tech

Amrata Joshi
30 Apr 2019
5 min read
Privacy issues are becoming a matter of concern. With Silicon Valley coming under the radar every now and then, and lawmakers taking a stand for users' privacy, it seems a lot of governments are now making an effort in this direction. In the US, lawmakers have already started working on lawsuits and regulations around violations of consumer data privacy. States like California have taken steps on issues related to privacy and surveillance. Last week, the Oakland Privacy Advisory Commission released two key documents in an initiative to protect Oaklanders' privacy, namely a proposed ban on facial recognition and the City of Oakland Privacy Principles.

https://twitter.com/cfarivar/status/1123081921498636288

Proposal to ban facial recognition tech

The committee has written this document, which covers regulations on Oakland's acquisition and use of surveillance technology. It defines Face Recognition Technology "as an automated or semi-automated process that assists in identifying or verifying an individual based on an individual's face."

According to this document, it will be unlawful for any city staff to retain, obtain, request, access, or use any Face Recognition Technology or any information obtained from Face Recognition Technology. City staff's unintentional receipt, access to, or use of any information obtained from Face Recognition Technology shall not violate the above, provided that the city staff does not request or solicit its receipt, access to, or use of such information, and that the city staff logs such access, receipt, or use in its Annual Surveillance Report.

Oakland privacy principles laid out by the committee

The Oakland Privacy Advisory Commission has listed a few principles with regard to users' data privacy for Oaklanders.
Following are the privacy principles:

Design and use equitable privacy practices

According to the first principle, community safety and access to city services shouldn't come at the cost of any Oaklander's right to privacy. The Commission aims to collect information in a way that won't discriminate against any Oaklander or Oakland community. Whenever possible, alternatives to the collection of personal data will be communicated at the time of data collection.

Limit collection and retention of personal information

According to this principle, personal information should be collected and stored only when, and for as long as, is justified to serve the purpose of collecting it in the first place. Information related to Oaklanders' safety, health, or access to city services should be protected. Oaklanders' views on the collection of information will be considered by the Commission.

Manage personal information with diligence

Oaklanders' personal information should be treated with respect and handled with care, regardless of how or by whom it was collected. To maintain the security of its systems, the software and applications that interact with Oaklanders' personal information are regularly updated and reviewed by the Commission. Personal information gathered from different departments will be combined only when there is a need. According to the Oakland Privacy Advisory Commission, encryption, minimization, deletion, and anonymization can reduce misuse of personal information; the Commission aims to make effective use of these tools and practices.

Extend privacy protections to our relationships with third parties

According to the Oakland Privacy Advisory Commission, the responsibility to protect Oaklanders' privacy should extend to vendors and partners. Oaklanders' personal information should be shared by the Commission with third parties only to provide city services, and only when doing so is consistent with these privacy principles.
The Commission will disclose the identity of parties with whom they share personal information, where the law permits them to do so.

Safeguard individual privacy in public records disclosures

According to the Commission, providing relevant information to interested parties about their services and governance is essential to democratic participation as well as civic engagement. The Commission will protect Oaklanders' individual privacy interests and the City's information security interests while still preserving the fundamental objective of the California Public Records Act of encouraging transparency.

Be transparent and open

The Commission states that Oaklanders have the right to access and understand explanations of why and how it collects, uses, manages, and shares personal information, and it aims to communicate these explanations to Oakland communities in plain and accessible language on the City of Oakland website.

Be accountable to Oaklanders

The Commission publicly reviews and discusses departmental requests for acquiring and using technology that can be used for surveillance purposes. The Commission further encourages Oaklanders to share their views and concerns regarding any system or department that collects and uses their personal information, or has the potential to do so, and to share their views on its compliance with these Principles.

Well, it seems Oakland has clearly signalled that development at the cost of Oaklanders' privacy won't be acceptable; there is still a long way to go for cities around the world with respect to their user privacy laws.

Russia opens civil cases against Facebook and Twitter over local data laws

Microsoft says tech companies are "not comfortable" storing their data in Australia thanks to the new anti-encryption law

Harvard Law School launches its Caselaw Access Project API and bulk data service making almost 6.5 million cases available
Google’s Sidewalk Lab smart city project threatens privacy and human rights: Amnesty Intl, CA says

Fatema Patrawala
30 Apr 2019
6 min read
Sidewalk Toronto, a joint venture between Sidewalk Labs, which is owned by Google parent company Alphabet Inc., and Waterfront Toronto, is proposing a high-tech neighbourhood called Quayside for the city's eastern waterfront. In March 2017, Waterfront Toronto had shared a request for proposal for this project with the Sidewalk Labs team. It ultimately got approval in October 2017, and the project is currently led by Alphabet's Eric Schmidt and Sidewalk Labs CEO Daniel Doctoroff.

As per reports from Daneilla Barreto, a digital activism coordinator for Amnesty International Canada, the project will normalize mass surveillance and is a direct threat to human rights.

https://twitter.com/AmnestyNow/status/1122932137513164801

The 12-acre smart city, which will be located between East Bayfront and the Port Lands, promises to tackle the social and policy challenges affecting Toronto: affordable housing, traffic congestion and the impacts of climate change. Imagine self-driving vehicles shuttling you around a 24/7 neighbourhood featuring low-cost, modular buildings that easily switch uses based on market demand. Picture buildings heated or cooled by a thermal grid that doesn't rely on fossil fuels, or garbage collection by industrial robots. Underpinning all of this is a network of sensors and other connected technology that will monitor and track environmental and human behavioural data.

That last part, about tracking human data, has sparked concerns. Much ink has been spilled in the press about privacy protections, and the issue has been raised repeatedly by citizens in two of four recent community consultations held by Sidewalk Toronto. The venture proposes to build the waterfront neighbourhood from scratch, embed sensors and cameras throughout, and effectively create a "digital layer". This digital layer may result in monitoring the actions of individuals and the collection of their data.
In the Responsible Data Use Policy Framework released last year, the Sidewalk Toronto team made a number of commitments with regard to privacy, such as not selling personal information to third parties or using it for advertising purposes. Barreto further argues that privacy was declared a human right and is protected under the Universal Declaration of Human Rights adopted by the United Nations in 1948. In the Sidewalk Labs conversation, however, privacy has been framed as a purely digital tech issue. Debates have focused on questions of data access: who owns it, how it will be used, where it should be stored and what should be collected.

In other words, the project would collect the minutest details of an individual's everyday life. For example, it could track what medical offices they enter, what locations they frequent and who their visitors are, in turn giving away clues to physical or mental health conditions, immigration status, whether an individual is involved in any kind of sex work, their sexual orientation or gender identity, or the kind of political views they might hold. Further down the line, this could affect their health status, employment, where they are allowed to live, or where they can travel.

All of this raises a question: do citizens want their data to be collected at this scale at all? That conversation remains long overdue. Not all communities have agreed to participate in this initiative, as marginalized and racialized communities will be affected most by surveillance. The Canadian Civil Liberties Association (CCLA) has threatened to sue the Sidewalk Toronto project, arguing that privacy protections should be spelled out before the project proceeds. Toronto's Mayor John Tory showed little interest in addressing these concerns during a panel on tech investment in Canada at South by Southwest (SXSW) on March 10.
Tory attended the event to promote the city as a go-to tech hub to the international audience at SXSW and other industry events. Last October, Saadia Muzaffar announced her resignation from Waterfront Toronto's Digital Strategy Advisory Panel. "Waterfront Toronto's apathy and utter lack of leadership regarding shaky public trust and social license has been astounding," the author and founder of TechGirls Canada said in her resignation letter. Later that month, Dr. Ann Cavoukian, a privacy expert and consultant for Sidewalk Labs, resigned as well, because she wanted all data collection to be anonymized or "de-identified" at the source, protecting the privacy of citizens.

Why does big tech really want your data?

Data is often called the "new oil": a rich resource that can be mined in a number of ways, from licensing it for commercial purposes to making it open to the public and freely shareable. Like oil, data has the power to create class warfare, permitting those who own it to control the agenda and leaving those who don't at their mercy. With the flow of data now contributing more to world GDP than the flow of physical goods, there's a lot at stake for the different players, and the data can benefit them in different ways.

Corporations are the primary beneficiaries of personal data, monetizing it through advertising, marketing and sales. Facebook, for example, has repeatedly come under the radar over the past two to three years for violating user privacy and mishandling data. For governments, data may serve the public good, improving quality of life for citizens via data-driven design and policies. But in some cases minorities and the poor are disproportionately harmed by mass surveillance, discriminatory algorithms and other data-driven technological applications. Mass surveillance can also discourage public and private dissent, curtailing freedom of speech and expression.
As per a NY Times report, low-income Americans have experienced a long history of disproportionate surveillance. The poor bear the burden of both ends of the spectrum of privacy harms: they are subject to greater suspicion and monitoring while applying for government benefits, live in heavily policed neighborhoods, and in some cases lose out on education and job opportunities. https://twitter.com/JulieSBrill/status/1122954958544916480

In more promising news, today the Oakland Privacy Advisory Commission released two key documents: one on Oakland's privacy principles and the other on a ban on facial recognition tech. https://twitter.com/cfarivar/status/1123081921498636288

The framework places strong emphasis on privacy, stating: "Privacy is a fundamental human right, a California state right, and instrumental to Oaklanders' safety, health, security, and access to city services. We seek to safeguard the privacy of every Oakland resident in order to promote fairness and protect civil liberties across all of Oakland's diverse communities."

Safety will be paramount for smart city initiatives such as Sidewalk Toronto. But we need more laws and policies like Oakland's that protect and support privacy and human rights, so that we can use technology safely and nothing happens that we did not consent to.

#NotOkGoogle: Employee-led town hall reveals hundreds of stories of retaliation at Google

Google announces new policy changes for employees to report misconduct amid complaints of retaliation and harassment

#GoogleWalkout organizers face backlash at work, tech workers show solidarity

AI can now help speak your mind: UC researchers introduce a neural decoder that translates brain signals to natural-sounding speech

Bhagyashree R
29 Apr 2019
4 min read
In research published in the journal Nature on Monday, a team of neuroscientists from the University of California, San Francisco, introduced a neural decoder that can synthesize natural-sounding speech based on brain activity. The research was led by Gopala Anumanchipalli, a speech scientist, and Josh Chartier, a bioengineering graduate student in the Chang lab, and was developed in the laboratory of Edward Chang, a professor of Neurological Surgery at the University of California.

Why is this neural decoder being introduced?

Many people lose their voice because of stroke, traumatic brain injury, or neurodegenerative diseases such as Parkinson's disease, multiple sclerosis, and amyotrophic lateral sclerosis. Assistive devices that track very small eye or facial muscle movements already exist, enabling people with severe speech disabilities to express their thoughts by writing them letter-by-letter. However, generating text or synthesized speech with such devices is often time-consuming, laborious, and error-prone. These devices have another limitation: they permit a maximum of about 10 words per minute, compared to the 100 to 150 words per minute of natural speech.

This research shows that it is possible to generate a synthesized version of a person's voice that can be controlled by their brain activity. The researchers believe that in the future this device could enable individuals with severe speech disability to communicate fluently. It could even reproduce some of the "musicality" of the human voice that expresses the speaker's emotions and personality. "For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual's brain activity," said Chang. "This is an exhilarating proof of principle that with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss."

How does this system work?
This research builds on an earlier study by Josh Chartier and Gopala K. Anumanchipalli, which showed how the speech centers in our brain choreograph the movements of the lips, jaw, tongue, and other vocal tract components to produce fluent speech. In the new study, Anumanchipalli and Chartier asked five patients being treated at the UCSF Epilepsy Center to read several sentences aloud. These patients had electrodes implanted into their brains to map the source of their seizures in preparation for neurosurgery. Simultaneously, the researchers recorded activity from a brain region known to be involved in language production.

The researchers used the audio recordings of the volunteers' voices to understand the vocal tract movements needed to produce those sounds. With this detailed map of sound to anatomy in hand, the scientists created a realistic virtual vocal tract for each volunteer that could be controlled by their brain activity. The system comprises two neural networks: a decoder that transforms brain activity patterns produced during speech into movements of the virtual vocal tract, and a synthesizer that converts these vocal tract movements into a synthetic approximation of the volunteer's voice.

Here's a video depicting the working of this system: https://www.youtube.com/watch?v=kbX9FLJ6WKw&feature=youtu.be

The researchers observed that the synthetic speech produced by this two-stage system was much better than synthetic speech decoded directly from the volunteers' brain activity. The generated sentences were also understandable to hundreds of human listeners in crowdsourced transcription tests conducted on the Amazon Mechanical Turk platform. The system is still in its early stages. Explaining its limitations, Chartier said, "We still have a ways to go to perfectly mimic spoken language.
We’re quite good at synthesizing slower speech sounds like ‘sh’ and ‘z’ as well as maintaining the rhythms and intonations of speech and the speaker’s gender and identity, but some of the more abrupt sounds like ‘b’s and ‘p’s get a bit fuzzy. Still, the levels of accuracy we produced here would be an amazing improvement in real-time communication compared to what’s currently available.” Read the full report on UCSF’s official website. OpenAI introduces MuseNet: A deep neural network for generating musical compositions Interpretation of Functional APIs in Deep Neural Networks by Rowel Atienza Google open-sources GPipe, a pipeline parallelism Library to scale up Deep Neural Network training  
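To make the two-stage design concrete, here is a minimal, purely illustrative sketch of the pipeline the article describes: neural activity is first mapped to vocal tract movements, which are then mapped to acoustic features. Everything here is an assumption for illustration: the array shapes, the feature counts, and the use of single random linear layers with a tanh nonlinearity. The actual system uses recurrent neural networks trained on electrocorticography (ECoG) recordings and produces audible speech, not just feature matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ELECTRODES = 256   # number of ECoG channels (assumed for illustration)
N_ARTICULATORS = 33  # vocal tract movement features (assumed)
N_AUDIO = 32         # acoustic output features, e.g. spectral bins (assumed)

# Stage 1: the "decoder", mapping neural activity to articulator kinematics.
# Here it is just an untrained random linear layer standing in for a trained RNN.
W_decode = rng.normal(size=(N_ELECTRODES, N_ARTICULATORS)) * 0.01

# Stage 2: the "synthesizer", mapping kinematics to acoustic features.
W_synth = rng.normal(size=(N_ARTICULATORS, N_AUDIO)) * 0.01

def decode_speech(brain_activity: np.ndarray) -> np.ndarray:
    """Map (time, N_ELECTRODES) neural data to (time, N_AUDIO) acoustics."""
    articulators = np.tanh(brain_activity @ W_decode)  # stage 1: movements
    acoustics = np.tanh(articulators @ W_synth)        # stage 2: sound features
    return acoustics

# One second of fake neural data sampled at 200 Hz.
signal = rng.normal(size=(200, N_ELECTRODES))
audio_features = decode_speech(signal)
print(audio_features.shape)  # (200, 32)
```

The key design point the study reports is visible even in this toy version: the intermediate articulator representation is what distinguishes the two-stage approach from decoding acoustics directly from brain activity.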