Tech Guides


9 Data Science Myths Debunked

Amey Varangaonkar
03 Jul 2018
9 min read
The benefits of data science are evident for all to see. Not only does it equip you with the tools and techniques to make better business decisions, the predictive power of analytics also allows you to determine future outcomes - something that can prove to be crucial to businesses. Despite all these advantages, data science is a touchy topic for many businesses. It's worth looking at some glaring stats that show why businesses are reluctant to adopt data science:

Poor data across businesses and organizations - in both the private and government sectors - costs the U.S. economy close to $3 trillion per year.
Only 29% of enterprises are able to properly leverage the power of Big Data and derive useful business value from it.

These stats show a general lack of awareness or knowledge when it comes to data science. Preconceived notions, or simply a lack of knowledge about data science and its application, seem to be a huge hurdle for these companies. In this article, we attempt to take down some of these notions and give a much clearer picture of what data science really is. Here are nine of the most common myths or misconceptions in data science, and why they are absolutely wrong.

Data Science is just a fad, it won't last long

This is probably the most common misconception. Many tend to forget that although 'data science' is a recently coined term, this field of study is a cumulation of decades of research and innovation in statistical methodologies and tools. It has been in use since the 1960s or even before - just that the scale at which it was being used then was small. Back in the day, there were no 'data scientists', just statisticians and economists who used now little-known terms such as 'data fishing' or 'data dredging'. Even the terms 'data analysis' and 'data mining' only went mainstream in the 1990s, but they were in use well before that period. Data science's rise to fame has coincided with the exponential rise in the amount of data being generated every minute. The need to understand this information and make positive use of it led to an increase in the demand for data science. Now, with Big Data and the Internet of Things going wild, the rate of data generation and the subsequent need for its analysis will only increase. So if you think data science is a fad that will go away soon, think again.

Data Science and Business Intelligence are the same

Those who are unfamiliar with what data science and Business Intelligence actually entail often get confused and think they're one and the same. No, they're not. Business Intelligence is an umbrella term for the tools and techniques that answer the operational and contextual aspects of your business or organization. Data science, on the other hand, has more to do with collecting information in order to build patterns and insights. Learning about your customers or your audience is Business Intelligence. Understanding why something happened, or whether it will happen again, is data science. If you want to gauge how changing a certain process will affect your business, data science - not Business Intelligence - is what will help you.

Data Science is only meant for large organizations with large resources

Many businesses and entrepreneurs are wrongly of the opinion that data science works only - or works best - for large organizations. It is a wrongly perceived notion that you need sophisticated infrastructure to process and get the most value out of your data.
In reality, all you need is a bunch of smart people who know how to get the best value out of the available data. When it comes to taking a data-driven approach, there's no need to invest a fortune in setting up an analytics infrastructure for an organization of any scale. There are many open source tools out there which can be easily leveraged to process large-scale data with efficiency and accuracy. All you need is a good understanding of the tools.

It is difficult to integrate data science systems with the organizational workflow

With the advancement of tech, one critical challenge that has now become very easy to overcome is getting different software systems to work together. With the rise of general-purpose programming languages, it is now possible to build a variety of software systems using a single programming language. Take Python, for example. You can use it to analyze your data, perform machine learning, or develop neural networks to work on more complex data models. All the while, you can link a web API written in Python to communicate with these data science systems. There are also provisions being made now to integrate code written in different programming languages while ensuring smooth interoperability and no added latency. So if you're wondering how to incorporate your analytics workflow into your organizational workflow, don't worry too much.
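To make that point concrete, here is a minimal sketch - not code from the article - of how a Python model and a Python web API can live in the same workflow. The /predict route, the feature layout, and the toy training data are illustrative assumptions.

```python
# A hedged sketch: train a scikit-learn model and expose it over a Flask API.
# The /predict route, feature count, and toy data are illustrative assumptions.
from flask import Flask, request, jsonify
from sklearn.linear_model import LogisticRegression
import numpy as np

# Toy training data: two numeric features, binary label.
X_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y_train = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X_train, y_train)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [2.5, 3.5]}.
    features = np.array(request.get_json()["features"]).reshape(1, -1)
    return jsonify({"prediction": int(model.predict(features)[0])})

if __name__ == "__main__":
    app.run(port=5000)
```

The same language, and even the same process, builds the model and serves it, which is the kind of single-language integration described above.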
Data Scientists will be replaced by Artificial Intelligence soon

Although there has been increased adoption of automation in data science, the notion that the work of a data scientist will soon be taken over by an AI algorithm is rather interesting. Currently, there is an acute shortage of data scientists, as this McKinsey Global Report suggests. Could this change in the future? Will automation completely replace human efforts when it comes to data science? Surely machines are a lot better than humans at finding patterns; AI beat the best Go player, remember. This is what the common perception seems to be, but it is not true. However sophisticated the algorithms become at automating data science tasks, we will always need a capable data scientist to oversee them and fine-tune their performance. Not just that, businesses will always need professionals with strong analytical and problem-solving skills and relevant domain knowledge. They will always need someone to communicate the insights coming out of the analysis to non-technical stakeholders. Machines don't ask questions of data. Machines don't convince people. Machines don't understand the 'why'. Machines don't have intuition. At least, not yet. Data scientists are here to stay, and their demand is not expected to go down anytime soon.

You need a Ph.D. in statistics to be a data scientist

No, you don't. Data science involves crunching numbers to get interesting insights, and it often involves the use of statistics to better understand the results. When it comes to performing some advanced tasks such as machine learning and deep learning, sure, an advanced knowledge of statistics helps. But that does not imply that people who do not have a degree in maths or statistics cannot become expert data scientists. Today, organizations are facing a severe shortage of data professionals capable of leveraging data to get useful business insights. This has led to the rise of citizen data scientists - professionals who are not experts in data science, but who can use data science tools and techniques to create efficient data models. These data scientists are no experts in statistics and maths; they just know the tool inside out, ask the right questions, and have the necessary knowledge of turning data into insights.

Having an expertise of the data science tools is enough

Many people wrongly think that learning a statistical tool such as SAS, or mastering Python and its associated data science libraries, is enough to earn the data scientist tag. While learning a tool or skill is always helpful (and also essential), by no means is it the only requisite for doing effective data science. One needs to go beyond the tools and also master skills such as non-intuitive thinking, problem-solving, and knowing the correct practical applications of a tool to tackle any given business problem. Not just that, it requires you to have excellent communication skills to present your insights and findings from even the most complex analysis to other stakeholders, in a way they can easily understand and interpret. So if you think that a SAS certification is enough to get you a high-paying data science job and keep it, think again.

You need to have access to a lot of data to get useful insights

Many small to medium-sized businesses don't adopt a data science framework because they think it takes lots and lots of data to be able to use analytics tools and techniques. Data in bulk always helps, true, but you don't need hundreds of thousands of records to identify a pattern or to extract relevant insights. Per IBM, Big Data is defined by the 4 Vs: Volume, Velocity, Veracity and Variety. If you are able to characterize your existing data along these dimensions, it automatically becomes useful and valuable. Volume is important to an extent, but it's the other three parameters that add the required quality.

More data = more accuracy

Many businesses collect large hoards of information and use the modern tools and frameworks available at their disposal for analyzing this data. Unfortunately, this does not always guarantee accurate results. Neither does it guarantee useful actionable insights or more value. Once the data is collected, a preliminary analysis of what needs to be done with the data is required. Then, we use the tools and frameworks at our disposal to extract the relevant insights and build an appropriate data model. These models need to be fine-tuned as per the processes for which they will be used. Then, eventually, we get the desired degree of accuracy from the model. Data in itself is quite useless. It's how we work on it - more precisely, how effectively we work on it - that makes all the difference.

So there you have it! Data science is one of the most popular skills to have on your resume today, but it is important to first clear up all the confusion and misconceptions that you may have about it. Lack of information or misinformation can do more harm than good when it comes to leveraging the power of data science within a business - especially considering it could prove to be the differentiating factor between its success and failure. Do you agree with our list? Do you think there are any other commonly observed myths around data science that we may have missed? Let us know.

Read more

30 common data science terms explained
Why is data science important?
15 Useful Python Libraries to make your Data Science tasks Easier


Anatomy of a Crypto Ransomware

Savia Lobo
23 May 2018
7 min read
Crypto ransomware is the worst threat at present. There are a lot of variants of crypto ransomware. Only some make it into the limelight, while others fade away. In this article, you will get to know about crypto ransomware and how easily it can be coded to encrypt certain directories and important files. One reason for the increase in the use of crypto ransomware could be that coding it is quite easy compared to other malware. The malware just needs to browse through user directories to find relevant files that are likely to be personal and encrypt them. The malware author need not write complex code, such as hooks to steal data. Most crypto ransomware doesn't care about hiding in the system, so most variants do not have rootkit components either. They only need to execute on the system once to encrypt all files. Some crypto ransomware also checks whether the system is already infected by other crypto ransomware. There is a huge list of crypto ransomware. Here are a few of them:

Locky
Cerber
CryptoLocker
Petya

This article is an excerpt taken from the book 'Preventing Ransomware', written by Abhijit Mohanta, Mounir Hahad, and Kumaraguru Velmurugan.

How does crypto ransomware work?

Crypto ransomware technically does the following things:

It finds files on the local system. On a Windows machine, it can use the FindFirstFile() and FindNextFile() APIs to enumerate files in directories. A lot of ransomware also searches for files present on shared drives.
It then checks for the file extensions that it needs to encrypt. Most variants have a hardcoded list of file extensions that the ransomware should encrypt. Even if it encrypts executables, it should not encrypt any of the system executables.
It makes sure that you cannot restore the files from backup, by deleting the backups. Sometimes this is done using the vssadmin tool. A lot of crypto ransomware uses the vssadmin command, provided by Windows, to delete shadow copies. Shadow copies are backups of files and volumes. The vssadmin (VSS administration) tool is used to manage shadow copies; VSS is the abbreviation of Volume Shadow Copy, also termed the Volume Snapshot Service.
After encrypting the files, the ransomware leaves a note for the victim. It is often termed a ransom note and is a message from the ransomware to the victim. It usually informs the victim that the files on their system have been encrypted and that, to decrypt them, they need to pay a ransom. The ransom note instructs the victim on how to pay the ransom.

The ransomware uses a few cryptographic techniques to encrypt files, communicate with the C&C server, and so on. We will explain this with an example in the next section. But before that, it's important to take a look at the basics of cryptography.

Overview of cryptography

A lot of cryptographic algorithms are used by malware today. Cryptography is a huge subject in itself, and this section just gives a brief overview of it. Malware can use cryptography for the following purposes:

To obfuscate its own code so that antivirus or security researchers cannot identify the actual code easily
To communicate with its own C&C server, sometimes to send hidden commands across the network and sometimes to exfiltrate stolen data
To encrypt the files on the victim machine

A cryptographic system can have the following components:

Plaintext
Encryption key
Ciphertext, which is the encrypted text
Encryption algorithm, also called a cipher
Decryption algorithm

There are two types of cryptographic algorithms, based on the kind of key used: symmetric and asymmetric. A few assumptions before explaining the algorithms: the sender is the person who sends the data after encrypting it, and the receiver is the person who decrypts the data with a key.

Symmetric key

In symmetric key encryption, the same key is used by both sender and receiver; it is also called the secret key. The sender uses the key to encrypt the data, while the receiver uses the same key to decrypt it. The following algorithms use a symmetric key: RC4, AES, DES, 3DES, Blowfish.

Asymmetric key

A symmetric key is simpler to implement, but it faces the problem of exchanging the key in a secure manner. Public key, or asymmetric, cryptography overcomes the problem of key exchange by using a pair of keys: public and private. A public key can be distributed in an unsecured manner, while the private key is always kept secret by the owner. Either of the keys can be used to encrypt, and the other can then be used to decrypt. The most popular algorithms here are RSA, Diffie-Hellman, ECC, and DSA. Secure protocols such as SSH have been implemented using public keys.

How does ransomware use cryptography?

Crypto ransomware started with simple symmetric key cryptography. But soon, researchers could decode these keys easily, so ransomware authors started using asymmetric keys. Ransomware of the current generation has started using both symmetric and asymmetric keys in a smart manner. CryptoLocker is known to use both a symmetric key and an asymmetric key. Here is the encryption process used by CryptoLocker:

When CryptoLocker infects a machine, it connects to its C&C server and requests a public key. An RSA public and private key pair is generated for that particular victim machine. The public key is sent to the victim machine, but the secret or private key is retained on the C&C server.
The ransomware on the victim machine generates an AES symmetric key, which is used to encrypt files.
After encrypting a file with the AES key, CryptoLocker encrypts the AES key with the RSA public key obtained from the C&C server.
The encrypted AES key, along with the encrypted file contents, is written back to the original file in a specific format.

So, in order to get the contents back, we need to decrypt the encrypted AES key, which can only be done using the private key present on the C&C server. This makes decryption close to impossible.
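To make the hybrid scheme concrete, here is a minimal sketch - using Python's cryptography library, not code from the book - of encrypting data with a fresh AES key and then wrapping that AES key with an RSA public key. Both key halves are generated locally here purely for illustration; in a real attack the private key would exist only on the attacker's C&C server.

```python
# Hedged illustration of the hybrid (AES + RSA) scheme described above.
# Both keys are generated locally here; real ransomware would receive only
# the RSA public key from its C&C server and never see the private key.
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# RSA key pair (the private half would normally stay on the C&C server).
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Fresh symmetric key used to encrypt the actual file contents.
aes_key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, b"contents of some document", None)

# The AES key itself is then wrapped with the RSA public key.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(aes_key, oaep)

# Only the holder of the RSA private key can unwrap the AES key and,
# with it, the file contents. Shown here because we hold both keys.
recovered_key = private_key.decrypt(wrapped_key, oaep)
assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == b"contents of some document"
```

The file enumeration, shadow copy deletion, and the rest of the workflow described earlier are deliberately left out; the point is only why the encrypted files cannot be recovered without the attacker's private key.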
Analyzing crypto ransomware

The malware analysis tools and concepts remain the same here too. Here are a few observations made while analyzing crypto ransomware specifically, which differ from other malware:

Usually, crypto ransomware, if executed, makes a large number of file modifications. You can see the changes in the filemon or procmon tools from Sysinternals.
File extensions are changed in a lot of cases. In this case, the extension is changed to .scl; the extension will vary with different crypto ransomware.
A lot of the time, a file with a ransom note is present on the system. Ransom notes differ between ransomware families and can be HTML, PDF, or text files. The ransom note's file usually has decryption instructions in its filename.

Prevention and removal techniques for crypto ransomware

In this case, prevention is better than cure; it's hard to decrypt the encrypted files in most cases. Security vendors have come up with decryption tools for ransomware-encrypted files, but there has been a large increase in the number of ransomware variants and in the complexity of the encryption algorithms they use, so these decryption tools sometimes fail to cope. http://www.thewindowsclub.com/list-ransomware-decryptor-tools gives you a list of tools meant to decrypt ransomware-encrypted files. These tools may not work in all cases of ransomware encryption.

If you've enjoyed reading this post, do check out 'Preventing Ransomware' for end-to-end coverage of this trending class of malware.

Top 5 cloud security threats to look out for in 2018
How cybersecurity can help us secure cyberspace
Cryptojacking is a growing cybersecurity threat, report warns


How Artificial Intelligence and Machine Learning can turbocharge a Game Developer's career

Guest Contributor
06 Sep 2018
7 min read
Gaming - whether board games or games set in the virtual realm - has been a massively popular form of entertainment since time immemorial. In the pursuit of creating more sophisticated, thrilling, and intelligent games, game developers have delved into ML and AI technologies to fuel innovation in the gaming sphere. The gaming domain is the ideal experimentation bed for evolving technologies, because not only do games pose complex and challenging problems for ML and AI to solve, they also serve as a ground for creativity - a meeting ground for machine learning and the art of interaction.

Machine Learning and Artificial Intelligence in Gaming

The reliance on AI for gaming is not a recent development. In fact, it dates back to 1949, when the famous cryptographer and mathematician Claude Shannon made his musings public about how a supercomputer could be made to master Chess. Then again, in 1952, a graduate student in the UK developed an AI that could play tic-tac-toe with ultimate perfection.

However, it isn't just ML and AI that are progressing through experimentation on games. Game development, too, has benefited a great deal from these pioneering technologies. AI and ML have helped enhance the gaming experience on many fronts, such as game design, the interactive quotient, and the inner functionality of games. These AI use cases focus on two primary things: one is to impart enhanced realism to the virtual gaming environment, and the second is to create a more naturalistic interface between the gaming environment and the players.

As of now, the focus of game developers, data scientists, and ML researchers lies in two specific categories of the gaming domain - games of perfect information and games of imperfect information. In games of perfect information, a player is aware of all the aspects of the game throughout the playing session, whereas in games of imperfect information, players are oblivious to specific aspects of the game. When it comes to games of perfect information such as Chess and Go, AI has shown various instances of overpowering human intelligence. Back in 1997, IBM's Deep Blue defeated world Chess champion Garry Kasparov in a six-game match. In 2016, Google's AlphaGo emerged as the victor in a Go match, scoring 4-1 after defeating South Korean Go champion Lee Sedol. One of the most advanced chess AIs developed yet, Stockfish, uses a combination of advanced heuristics and brute force to compute a numeric value for each and every move in a given Chess position. It also effectively eliminates bad moves using the alpha-beta pruning search algorithm.
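As a rough illustration of the kind of search being referred to, here is a minimal, game-agnostic sketch of minimax with alpha-beta pruning in Python. It is not Stockfish's implementation - a real engine adds sophisticated evaluation heuristics, move ordering, and many other optimizations - and the evaluate, legal_moves, and apply callables are assumptions standing in for a real engine's evaluation function and move generator.

```python
# A hedged, game-agnostic sketch of minimax search with alpha-beta pruning.
# `evaluate`, `legal_moves`, and `apply` stand in for a real engine's
# evaluation function and move generator; they are assumptions, not Stockfish.
import math

def alphabeta(state, depth, alpha, beta, maximizing, evaluate, legal_moves, apply_move):
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)          # static evaluation at the search horizon
    if maximizing:
        value = -math.inf
        for move in moves:
            value = max(value, alphabeta(apply_move(state, move), depth - 1,
                                         alpha, beta, False,
                                         evaluate, legal_moves, apply_move))
            alpha = max(alpha, value)
            if alpha >= beta:           # the opponent will never allow this line
                break                   # so the remaining moves are pruned
        return value
    else:
        value = math.inf
        for move in moves:
            value = min(value, alphabeta(apply_move(state, move), depth - 1,
                                         alpha, beta, True,
                                         evaluate, legal_moves, apply_move))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value
```

The cutoff is what lets engines like Stockfish discard large parts of the game tree without ever evaluating them.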
While the progress and contribution of AI and ML to the field of games of perfect information is laudable, researchers are now intrigued by games of imperfect information. Games of imperfect information offer much more challenging situations that are essentially difficult for machines to learn and master. Thus, the next evolution in the world of gaming will be to create spontaneous gaming environments using AI technology, in which developers build only the gaming environment and its mechanics instead of creating a game with pre-programmed or scripted plots. In such a scenario, the AI will have to confront and solve spontaneous challenges with personalized scenarios generated on the spot. Games like StarCraft and StarCraft II have stirred up massive interest among game researchers and developers. In these games, the players are only partially aware of the gaming aspects, and the game is largely determined not just by the AI's moves and the previous state of the game, but also by the moves of other players. Since in these games you have little knowledge about your rival's moves, you have to make decisions on the go and your moves have to be spontaneous. The recent win of OpenAI Five over amateur human players in Dota 2 is a good case in point. OpenAI Five is a team of five neural networks that leverages an advanced version of Proximal Policy Optimization and uses a separate LSTM to learn identifiable strategies. The progress of OpenAI Five shows that even without human data, reinforcement learning can facilitate long-term planning, thus allowing us to make further progress in games of imperfect information.

Career in Game Development With ML and AI

As ML and AI continue to penetrate the gaming industry, they are creating a huge demand for talented and skilled game developers who are well-versed in these technologies. Today, game development is at a place where it's no longer necessary to build games using time-consuming manual techniques. ML and AI have made the task of game developers easier: by leveraging these technologies, they can design and build innovative gaming environments and test them automatically. The integration of AI and ML in the gaming domain is giving birth to new job positions like Gameplay Software Engineer (AI), Gameplay Programmer (AI), and Game Security Data Scientist, to name a few. The salaries of traditional game developers are in stark contrast with those of developers having AI/ML skills. While the average salary of game developers is usually around $44,000, it can scale up to and over $120,000 if one possesses AI/ML skills.

Gameplay Engineer

Average salary - $73,000 - $116,000

Gameplay engineers are usually part of the core game dev team and are entrusted with the responsibility of enhancing the existing gameplay systems to enrich the player experience. Companies today demand gameplay engineers who are proficient in C/C++ and well-versed with AI/ML technologies.

Gameplay Programmer

Average salary - $98,000 - $149,000

Gameplay programmers work in close collaboration with the production and design teams to develop cutting-edge features in existing and upcoming gameplay systems. Programming skills are a must, and knowledge of AI/ML technologies is an added bonus.

Game Security Data Scientist

Average salary - $73,000 - $106,000

The role of a game security data scientist is to combine security and data science approaches to detect anomalies and fraudulent behavior in games. This calls for a high degree of expertise in AI, ML, and other statistical methods.

With impressive salaries and exciting job opportunities cropping up fast in the game development sphere, the industry is attracting some major talent. Game developers and software developers around the world are choosing the field due to the promise of rapid career growth. If you wish to bag better and more challenging roles in the domain of game development, you should definitely try to upskill your talent and knowledge base by mastering the fields of ML and AI.

Packt Publishing is the leading UK provider of Technology eBooks, Coding eBooks, Videos and Blogs, helping IT professionals to put software to work. It offers several books and videos on game development with AI and machine learning. It's never too late to learn new disciplines and expand your knowledge base.
There are numerous online platforms that offer great artificial intelligence courses. The perk of learning from a registered online platform is that you can learn and grow at your own pace and according to your convenience. So, enroll yourself in one and spice up your career in game development!

About Author: Abhinav Rai is the Data Analyst at UpGrad, an online education platform providing industry-oriented programs in collaboration with world-class institutes, some of which are MICA, IIIT Bangalore, and BITS, and various industry leaders including MakeMyTrip, Ola, and Flipkart.

Best game engines for AI game development
Implementing Unity game engine and assets for 2D game development [Tutorial]
How to use arrays, lists, and dictionaries in Unity for 3D game development


Is serverless architecture a good choice for app development?

Mehul Rajput
11 Oct 2019
6 min read
App development has evolved rapidly in recent years. With new demands and expectations from businesses and users, trends like cloud have helped developers to be more productive and to build faster, more reliable, and more secure applications. But there's no end to evolution - and serverless is arguably the next step for application development. But is a serverless architecture the right choice?

What is a Serverless Architecture?

When you hear the word 'serverless', you might assume that it means no servers. In fact, it really refers to the elimination of the need to manage the servers; it shifts that responsibility to your cloud provider. Simply put, it means that the constituent parts of an application are divided between multiple servers, with no need for the application owner or manager to create or manage the infrastructure that supports it. Instead of running off a server, with a serverless architecture an application runs off functions. These are essentially actions that are fired off to ensure things happen within the application. This is where the phrase 'function-as-a-service', or FaaS (another way of describing serverless), comes from. A recent report claims that the FaaS market is projected to grow to 7.72 billion US dollars by 2021, at an annual growth rate of 32.7%.

Is Serverless Architecture a Good Choice for App Development?

Now that we've established what serverless actually means, we must get to business. Is serverless architecture the right choice for app development? Well, it can work either way - it can be positive as well as negative. Here are some reasons.

Using serverless for app development: the positives

There are many reasons why serverless architecture can be good for app development. Some of them are discussed below:

Decreasing costs
Easier to service
Scalability
Third-party services

Decreasing costs

The most significant benefit of a serverless architecture in an app development process is that it reduces the cost of the work. It's typically less expensive than a 'traditional' server architecture. The reason is that with hardware servers, you have to pay for many different things that might not be required: regular maintenance, the premises, the electricity, and staff to maintain the machines. Hence, you can save a considerable amount of money and spend it on app quality instead.

Easier to service

It is a rational thought that when the owner or app manager does not have to manage the server themselves, and a machine can do this job, then it won't be as challenging to keep the service accessible. First, it makes the job more comfortable because it does not require supervision. Second, you will not have to spend time on it; instead, you can use this time for productive work such as product development. Third, the service provided by this technology is reliable, and hence you can use it without much fear.

Scalability

Another interestingly useful advantage of serverless architecture in app development is scalability. So, what is scalability? It is the capability of a system to handle an extra amount of work by adding resources, so that an app or product continues to work appropriately, without disturbance, when it grows in size or volume to meet users' needs. A serverless architecture provides the resources that are added to the system to handle any work that has piled up.
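Before turning to third-party services, here is a minimal sketch of what a 'function' in the FaaS sense described above can look like, assuming AWS Lambda's Python runtime as the provider. The event shape and handler behavior are illustrative assumptions, not part of the original article.

```python
# Hedged sketch of a FaaS-style function, assuming the AWS Lambda Python runtime.
# The cloud provider provisions, scales, and bills per invocation; the developer
# supplies only this handler. The event shape below is an assumption.
import json

def lambda_handler(event, context):
    # Fired off in response to an API Gateway request, a queue message, etc.
    body = json.loads(event.get("body") or "{}")
    name = body.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Scaling this out to thousands of concurrent invocations is the provider's problem, which is exactly the scalability argument made above.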
Third-party services

Another essential and useful feature of serverless architecture is that, going this way, you can use third-party services. Your app can use any third-party service it requires beyond what you already have. This reduces the effort needed to create the backend architecture of the app. Additionally, the third party might provide better services than you could build yourself. Hence, serverless architecture can eventually prove to be better, as it gives you the reach of third-party services.

Serverless for app development: the negatives

Now that we know all the advantages of a serverless architecture, it's important to note that it can also bring some limitations and disadvantages. These are:

Time restrictions
Vendor lock-in
Multi-tenancy
Debugging is not possible

Time restrictions

As mentioned before, serverless architecture works on FaaS rules and has a time limit for running a function. This time limit is 300 seconds exactly. When this limit is reached, the function is stopped. Therefore, for more complex functions that require more time to execute, the FaaS approach may not be a good choice. The problem can sometimes be tackled easily by splitting a task into several simpler functions, if the task allows it. Otherwise, time restrictions like these can cause great difficulty.

Vendor lock-in

We have discussed that by using serverless architecture, we can utilize third-party services. Well, this can also go the wrong way and cause vendor lock-in. If, for any reason, you decide to shift to a new service provider, in most cases services will be fulfilled in a different way. That means the productivity gains you expected from serverless will be lost, as you will have to adjust and reconfigure the infrastructure to accept the new service.

Multi-tenancy

Multi-tenancy is an increasing problem in serverless architecture. The data of many tenants is kept quite near to each other. This can create confusion: some data might be exchanged, distributed, or possibly lost. In turn, this can cause security and reliability issues. A customer could, for example, suddenly produce an extraordinarily high load which would affect other customers' applications.

Debugging is not possible

Conventional debugging isn't possible with serverless. Because the code is uploaded to and run on the provider's platform, there is no debugging facility where the uploaded code can be stepped through. If you want to know how a function behaves, you have to run it and wait for the result; the function can crash, and you cannot do much about it directly. However, there is a way to mitigate this problem as well: you can use extensive logging which, with every step being logged, decreases the chances of errors that cause debugging issues.

Conclusion

Serverless architecture certainly seems impressive in spite of having some limitations. There is no doubt that the viability and success of an architecture depends on the business requirements and, of course, on the technology used. In the same way, serverless can shine if used in the appropriate case. I hope this blog has helped you understand serverless architecture for mobile apps, and lets you see both its bright and dark sides.

Author Bio

Mehul Rajput is the CEO and co-founder of Mindinventory, which specializes in Android and iOS app development and provides web and mobile app solutions from startup to enterprise-level businesses.
He is an avid blogger and writes on mobile technologies, mobile apps, app marketing, app development, startups, and business.

What is serverless architecture and why should I be interested?
Introducing numpywren, a system for linear algebra built on a serverless architecture
Serverless Computing 101
Modern Cloud Native architectures: Microservices, Containers, and Serverless – Part 1
Modern Cloud Native architectures: Microservices, Containers, and Serverless – Part 2


How to build secure microservices

Rick Blaisdell
13 Jul 2017
4 min read
A few years back, everybody was looking for an architecture that would make web and mobile application development more flexible, reliable, efficient, and scalable. In 2014, we found the answer when an innovative architectural solution was developed - microservices. The fastest growing companies are built around microservices. What makes microservice architecture fascinating is its characteristics:

Microservices are organized around capabilities, such as recommendations, front-end, and user interface.
You can implement them using various programming languages, databases, software, and environments.
The services lend themselves to a continuous delivery software development process; if there are any changes to be made in the application, they require only a few changes in a service.
They are easy to replace with other microservices.
These services are independently deployable, autonomously developed, and messaging enabled.

So, it's easy to understand why a microservice architecture is a perfect way to accelerate both web and mobile application development. However, one also needs to understand how to build secure microservices. Security is the top priority for every business. Designing a safe microservices architecture can be simple if you follow these guidelines:

Define access control and authorization – This is one of the crucial steps in reaching a higher level of security. It's important to first understand how each microservice could be compromised and what damage could be done. This will make it much easier for you to develop a strategy that safeguards against these incidents.
Map communications – Outlining all communication methods between microservices will give you valuable insight into any vulnerability that might eventually be exploited in a malicious attack.
Use centralized security or configuration policies – Human error is one of the most common reasons why platforms, devices, or networks get hacked or damaged. It's a fact! Employing a centralized security or configuration policy will reduce human interaction with the microservices and build the long-desired consistency.
Establish common, repeatable coding standards – Repeatable coding standards must be set up right from the development stage. They reduce divergences that might lead to exploitable vulnerabilities.
Use 'defense in depth' to protect vital services – From our experience, we know that a single firewall is not strong enough to protect our entire software. Enabling multi-factor authentication, which places multiple layers of security controls, is an effective way to ensure a robust security level.
Use automatic security updates – This is crucial and easy to set up.
Review microservices code – Having multiple experts review the code is a great way of making sure that errors have not slipped through the cracks.
Deploy an API gateway – If you expose one or more APIs for external access, then deploying an API gateway can reduce security risks. Moreover, you need to make sure that all API traffic is encrypted using TLS. In fact, TLS should be used for all internal communications right from the beginning, to ensure the security of your systems.
Use intrusion tools and request fuzzers – We all know that it is better to find issues before an attacker does. 'Fuzzing' is a technique that finds code vulnerabilities by sending large quantities of random data to the system. This approach will ultimately highlight whether the code can be made to fail or be compromised, and what causes it. A toy illustration of the idea is sketched below.
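As a rough illustration of request fuzzing - not a production security tool - the sketch below sends random payloads to a hypothetical local endpoint and flags responses that indicate the service fell over. The URL and payload shape are assumptions, and it should only ever be pointed at a service you own and are authorized to test.

```python
# Hedged sketch of request fuzzing: fire random payloads at an endpoint and
# watch for server errors. The URL below is a hypothetical service under test.
import random
import string
import requests

TARGET = "http://localhost:8080/api/orders"  # assumption: your own test service

def random_payload(max_len=200):
    # Random printable junk of random length - deliberately not well-formed JSON.
    length = random.randint(0, max_len)
    return "".join(random.choice(string.printable) for _ in range(length))

for i in range(100):
    payload = random_payload()
    try:
        resp = requests.post(TARGET, data=payload, timeout=5)
        if resp.status_code >= 500:
            print(f"case {i}: server error {resp.status_code} for input {payload!r}")
    except requests.RequestException as exc:
        print(f"case {i}: request failed ({exc}) for input {payload!r}")
```

Real fuzzers and intrusion-testing tools are far more systematic, but even a loop like this often surfaces unhandled-input crashes before an attacker finds them.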
Now that we're all set with the security measures required for building microservices, I would like to give a quick overview of the benefits that this innovative architecture has to offer:

Fewer dependencies between teams
The ability to run multiple initiatives in parallel
Support for various technologies, frameworks, and languages
Ease of innovation through disposable code

Besides the tangible advantages named above, microservices deliver increased value to your business, such as agility, comprehensibility of the software systems, independent deployability of components, and organizational alignment of services. I hope that this article will help you build a secure microservices architecture that will add value to your business.

About the Author

Rick Blaisdell is an experienced CTO, offering cloud services and creating technical strategies which reduce IT operational costs and improve efficiency. He has 20 years of product, business development, and high-tech experience with Fortune 500 companies, developing innovative technology strategies.


What is distributed computing and what's driving its adoption?

Melisha Dsouza
07 Nov 2018
8 min read
Distributed computing is having a real impact on the way companies look at the cloud. The "Most Promising Jobs 2018" report published by LinkedIn pointed out that distributed and cloud computing rank amongst the top 10 most in-demand skills.

What are the problems with centralized computing systems?

Distributed computing solves many of the challenges that centralized computing systems pose today. These centralized systems - like IBM mainframes - have been around for decades, but they're beginning to lose favor. This is because centralized computing is ineffective and expensive in the context of increasing data and workloads. When you have a single central computer which controls a massive amount of computation at the same time, it puts a massive strain on the system - even one that's particularly powerful. Centralized systems simply aren't capable of processing huge volumes of transactional data and supporting tons of online users concurrently. There's also a big issue with reliability: if your centralized server fails, all data could be permanently lost if you have no disaster recovery strategy. Fortunately, distributed computing offers solutions to many of these issues.

How does distributed computing work?

Distributed computing comprises a group of systems located at different places, all connected over a network, working on a single problem or a common goal. Each one of these systems is autonomous, programmable, asynchronous, and failure-prone. These systems provide a better price/performance ratio when compared to a centralized system, because it's more economical to add microprocessors rather than mainframes to your network. They also have more computational power than centralized (mainframe) computing systems.

Distributed computing and agility

Another major plus point of distributed computing systems is that they provide much greater agility than centralized computing systems. Without centralization, organizations can add and change software and computational power according to the demands and needs of the business. With the reduction in the price of computing power and storage, thanks to the rise of public cloud services like AWS, organizations all over the world have begun using distributed systems and service-oriented architectures, like microservices.

Distributed computing in action: Google search

A perfect example of distributed computing in action is Google search. When a user submits a query, Google uses data from a number of different servers to deliver results, based on things like location, past searches, semantic keywords, and much, much more. These servers are located all around the world and are able to provide search results in seconds, or at times milliseconds.

How cloud is driving the adoption of distributed computing

Central to this adoption is the cloud. Today, cloud is mainstream and opens up the possibility of distributed systems to organizations in a number of different ways. Arguably, you're not really seeing the full potential of cloud until you've moved to a distributed system. Let's take a look at the different ways cloud services are helping companies feel confident enough to successfully leverage distributed computing.

Infrastructure as a Service (IaaS)

IaaS makes distributed systems accessible for many organizations by allowing them to host their infrastructure either internally on a private cloud or on a public cloud.
Essentially, IaaS gives an organization control over the operating system and platform that form the foundation of its software infrastructure, but gives an external cloud provider control over the servers and virtualization technologies that make it possible to deploy that infrastructure. In the context of a distributed system, this means organizations have less to worry about. As you can imagine, without IaaS, the process of developing and deploying a distributed system becomes much more complex and even costly.

Platform as a Service: custom software on another platform

If IaaS effectively splits responsibilities between the organization and the cloud provider (the 'service'), Platform as a Service (PaaS) 'outsources' even more to the cloud provider. Essentially, an organization simply has to handle the applications and data, leaving every other aspect of its infrastructure to the platform. This brings many benefits and, in theory, should allow even relatively small engineering teams to take advantage of a distributed system. The underlying complexity and heavy lifting that a distributed system brings rests with the cloud provider, allowing an organization's engineers to focus on what matters most - shipping code. If you're thinking about speed and innovation, then a PaaS opens that right up, provided you're happy to let your cloud provider manage the bulk of your infrastructure.

Software as a Service

SaaS solutions are perhaps the clearest example of a distributed system. Arguably, given the way we use SaaS today, it's easy to forget that it can be part of a distributed system. The concept is simple: it's a complete software solution delivered to the end user. If you're trying to accomplish something particularly complex, something which you simply do not have the resources to do yourself, a SaaS solution could be effective. Users don't need to worry about installing and maintaining software; they can simply access it via the internet.

The biggest advantages of adopting a distributed computing system

#1 Complete control over the system architecture

Distributed computing opens up your options when it comes to system architecture. Although you might rely on an external cloud service for some resources (like compute or storage), the architectural decisions are ultimately yours. This means that you can make decisions based on exactly what your organization needs and how it works. In a sense, this is why distributed computing can bring you agility - not just agility in the strict sense, but also in a broader sense of the word: it allows you to prioritize according to your own needs and demands.

#2 Improved 'absolute performance' of the computing system

Tasks can be partitioned into sub-computations that can run concurrently. This, in turn, provides a total speedup of task completion. What's more, if a particular site is currently overloaded with jobs, some of them can be moved to lightly loaded sites. This technique of 'load sharing' can boost the performance of your system. Essentially, distributed systems minimize latency and response time while increasing throughput.
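The partitioning idea can be shown on a single machine with Python's standard library. This is a hedged sketch, assuming a toy word-count job split into chunks; a real distributed system would dispatch these chunks to separate nodes over the network rather than to local worker processes.

```python
# Hedged sketch: partition a job into sub-computations that run concurrently.
# Locally this uses worker processes; in a true distributed system the same
# chunks would be dispatched to different machines on the network.
from concurrent.futures import ProcessPoolExecutor
from collections import Counter

def count_words(chunk_of_lines):
    # The sub-computation each worker performs independently.
    counts = Counter()
    for line in chunk_of_lines:
        counts.update(line.split())
    return counts

def partition(items, n_chunks):
    # Split the workload into roughly equal chunks.
    size = max(1, len(items) // n_chunks)
    return [items[i:i + size] for i in range(0, len(items), size)]

if __name__ == "__main__":
    lines = ["the quick brown fox", "jumps over the lazy dog"] * 1000  # toy data
    total = Counter()
    with ProcessPoolExecutor(max_workers=4) as pool:
        for partial in pool.map(count_words, partition(lines, 4)):
            total += partial  # combine the partial results
    print(total.most_common(3))
```

The speedup comes from the sub-computations running at the same time; load sharing is the same idea applied across sites instead of cores.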
#3 The price-to-performance ratio of the system

Distributed networks offer a better price/performance ratio compared to centralized mainframe computers. This is because decentralized and modular applications can share expensive peripherals, such as high-capacity file servers and high-resolution printers. Similarly, multiple components can be run on nodes with specialized processing, which further reduces the cost of multiple specialized processing systems.

#4 Disaster recovery

Distributed systems involve services communicating through different machines. This is where message integrity, confidentiality, and authentication come into play. In such a case, distributed computing gives organizations the flexibility to deploy a four-way mechanism to keep operations secure: encryption, authentication, authorization, and auditing. Another aspect of disaster recovery is reliability. If computation and the associated data are effectively built into a single machine, and that machine goes down, the entire service goes with it. With a distributed system, what could happen instead is that specific services might go down, but the whole thing should, in theory at least, stay standing.

#5 Resilience through replication

So, if specific services can go down within a distributed system, you still need to do something to increase resilience. You do this by replicating services across multiple nodes, minimizing potential points of failure. This is what's known as fault tolerance - it improves system reliability without affecting the system as a whole. It's also worth pointing out that the hardware on which a distributed system is built is replaceable - this is better than depending on centralized hardware which, if it fails, will take everything with it.

Another distributed computing example: SETI

A good example of a distributed system is SETI. SETI collects massive amounts of data from observatories around the world on activity in the sky, in a bid to identify possible signs of extraterrestrial life. This information is then sliced into smaller pieces of data for easy analysis through distributed computing applications running as a screensaver on individual user PCs all around the world. A PC running the SETI screensaver downloads a small file, and while the PC is unused, the screensaver downloads a data slice from SETI. It then runs the analytics application while the PC is idle, and when the analysis is complete, the analyzed data slice is uploaded back to SETI. This massive data analysis is possible only because of distributed computing.

So, although distributed computing has become a bit of a buzzword, the technology is gaining traction in the minds of customers and service providers. Beyond the hype and debate, these services will ultimately help companies to be more responsive to market conditions while restraining IT costs.

Cloudflare's decentralized vision of the web: InterPlanetary File System (IPFS) Gateway to create distributed websites
Oath's distributed network telemetry collector - 'Panoptes' is now open source!
Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018

How has ethical hacking benefited the software industry

Fatema Patrawala
27 Sep 2019
8 min read
In an online world infested with hackers, we need more ethical hackers. But all around the world, hackers have long been portrayed by the media and pop culture as the bad guys. Society is taught to see them as cyber-criminals and outliers who seek to destroy systems, steal data, and take down anything that gets in their way. There is no shortage of news, stories, movies, and television shows that outright villainize the hacker. From the 1995 movie Hackers to the more recent Blackhat, hackers are often portrayed as outsiders who use their computer skills to inflict harm and commit crime.

Read this: Did you know hackers could hijack aeroplane systems by spoofing radio signals?

While there have been real-world, damaging events created by cyber-criminals that serve as the inspiration for this negative messaging, it is important to understand that this is only one side of the story. The truth is that while there are plenty of criminals with top-notch hacking and coding skills, there is also a growing and largely overlooked community of ethical (commonly known as white-hat) hackers who work endlessly to help make the online world a better and safer place. To put it lightly, these folks use their cyber superpowers for good, not evil. For example, Linus Torvalds, the creator of Linux, was a hacker, as was Tim Berners-Lee, the man behind the World Wide Web. The list is long, for the same reason the list of hackers turned coders is long - they all saw better ways of doing things.

What is ethical hacking?

According to the EC-Council, an ethical hacker is "an individual who is usually employed with an organization and who can be trusted to undertake an attempt to penetrate networks and/or computer systems using the same methods and techniques as a malicious hacker."

Listen: We discuss what it means to be a hacker with Adrian Pruteanu [Podcast]

The role of an ethical hacker is important, since the bad guys will always be there, trying to find cracks, backdoors, and other secret ways to access data they shouldn't. Ethical hackers not only help expose flaws in systems, but assist in repairing them before criminals even have a shot at exploiting said vulnerabilities. They are an essential part of the cybersecurity ecosystem and can often unearth serious unknown vulnerabilities in systems better than any security solution ever could. Certified ethical hackers make an average annual income of $99,000, according to Indeed.com. The average starting salary for a certified ethical hacker is $95,000, according to EC-Council senior director Steven Graham.

Ways ethical hacking benefits the software industry

Nowadays, ethical hacking has become increasingly mainstream, and multinational tech giants like Google, Facebook, Microsoft, Mozilla, and IBM employ hackers or teams of hackers to keep their systems secure. And as a result of the success hackers have shown at discovering critical vulnerabilities, in the last year alone there has been a 26% increase in organizations running bug bounty programs, where they bolster their security defenses with hackers. Beyond this, there are a number of benefits that ethical hacking has provided to organizations, particularly in the software industry.

Carry out adequate preventive measures to avoid security breaches

An ethical hacker takes preventive measures to avoid security breaches. For example, they use port scanning tools like Nmap or Nessus to scan their own systems and find open ports.
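A minimal sketch of what such a scan boils down to, using only Python's standard library and scanning localhost; the port range is an arbitrary assumption, and in practice a dedicated tool like Nmap is both faster and far more thorough. Only scan systems you are authorized to test.

```python
# Hedged sketch of a TCP connect scan against your own machine.
# Real assessments use purpose-built tools such as Nmap; this only shows the idea.
import socket

TARGET = "127.0.0.1"          # scan only hosts you own or are authorized to test
PORTS = range(1, 1025)        # assumption: the well-known port range

open_ports = []
for port in PORTS:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.2)                      # keep the scan fast
        if sock.connect_ex((TARGET, port)) == 0:  # 0 means the connection succeeded
            open_ports.append(port)

print("Open ports:", open_ports)
```

Each open port found this way is then examined for exploitable services, which is exactly the next step described below.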
The vulnerabilities associated with each of these ports are studied, and remedial measures are taken. An ethical hacker will examine patch installations and make sure that they cannot be exploited. They also engage in social engineering concepts like dumpster diving - rummaging through trash bins for passwords, charts, sticky notes, or anything with crucial information that can be used to generate an attack. They also attempt to evade IDS (Intrusion Detection Systems), IPS (Intrusion Prevention Systems), honeypots, and firewalls. They carry out actions like bypassing and cracking wireless encryption, and hijacking web servers and web applications.

Perform penetration tests on networks at regular intervals

One of the best ways to prevent illegal hacking is to test the network for weak links on a regular basis. Ethical hackers help clean and update systems by discovering new vulnerabilities on an ongoing basis. Going a step further, ethical hackers also explore the scope of the damage that could occur due to each identified vulnerability. This process is known as pen testing, and it is used to identify network vulnerabilities that an attacker can target. There are many methods of pen testing, and an organization may use different methods depending on its requirements. Any of the pen testing methods below can be carried out by an ethical hacker:

Targeted testing involves the organization's people and the hacker; the organization's staff are aware of the hacking being performed.
External testing penetrates all externally exposed systems, such as web servers and DNS.
Internal testing uncovers vulnerabilities open to internal users with access privileges.
Blind testing simulates real attacks from hackers: testers are given limited information about the target, which requires them to perform reconnaissance prior to the attack.

Pen testing is the strongest case for hiring ethical hackers.

Ethical hackers have built computers and programs for the software industry

Going back to the early days of the personal computer, many members of the Silicon Valley community would have been considered hackers in modern terms, in that they pulled things apart and put them back together in new and interesting ways. This desire to explore systems and networks to find out how they worked made many of these proto-hackers more knowledgeable about the different technologies and how they could be safeguarded from malicious attacks. Just as many of the early computer enthusiasts turned out to be great at designing new computers and programs, many people who identify themselves as hackers are also amazing programmers. This trend of the hacker as the innovator has continued with the open-source software movement. Much of the open-source code is produced, tested, and improved by hackers - usually during collaborative computer programming events, which are affectionately referred to as "hackathons." Even if you never touch a piece of open-source software, you still benefit from the elegant solutions that hackers come up with, which inspire or are outright copied by proprietary software companies.

Ethical hackers help safeguard customer information by preventing data breaches

The personal information of consumers is the new oil of the digital world. Everything runs on data. But while businesses that collect and process consumer data have become increasingly valuable and powerful, recent events prove that even the world's biggest brands are vulnerable when they violate their customers' trust.
Hence, it is of utmost importance for software businesses to gain the trust of customers by ensuring the security of their data. With high-profile data breaches seemingly in the news every day, "protecting businesses from hackers" has traditionally dominated the data privacy conversation.

Read this: StockX confirms a data breach impacting 6.8 million customers

In such a scenario, ethical hackers will prepare you for the worst; they work in conjunction with the IT incident-response plan to ensure data security and to help patch breaches when they do happen. Otherwise, you risk a disjointed, inconsistent, and delayed response to issues or crises. It is also imperative to align how your organization will communicate with stakeholders. This will reduce the need for real-time decision-making in an actual crisis, as well as help limit inappropriate responses. Ethical hackers may also help in running a cybersecurity crisis simulation to identify flaws and gaps in your process, and better prepare your teams for such a pressure-cooker situation when it hits.

An information security plan to create security awareness at all levels

No matter how large or small your company is, you need to have a plan to ensure the security of your information assets. Such a plan is called a security program, and it is framed by information security professionals. The IT security team primarily devises the security program, but if this is done in coordination with ethical hackers, they can provide the framework for keeping the company at the desired security level. Additionally, by assessing the risks the company faces, they can decide how to mitigate them and plan how to keep the program and security practices up to date.

To summarize…

Many white-hat, gray-hat, and reformed black-hat hackers have made significant contributions to the advancement of technology and the internet. In truth, hackers are almost in the same situation as motorcycle enthusiasts, in that the existence of a few motorcycle gangs with real criminal operations tarnishes the image of the entire subculture. You don't need to go out and hug the next hacker you meet, but it might be worth remembering that the word hacker doesn't equal criminal, at least not all the time. Our online ecosystem is made safer, better, and more robust by ethical hackers. As Keren Elazari, an ethical hacker herself, put it: "We need hackers, and in fact, they just might be the immune system for the information age. Sometimes they make us sick, but they also find those hidden threats in our world, and they make us fix it."

3 cybersecurity lessons for e-commerce website administrators
Hackers steal bitcoins worth $41M from Binance exchange in a single go!
A security issue in the net/http library of the Go language affects all versions and all components of Kubernetes

Introducing Dask: The library that makes scalable analytics in Python easier

Amey Varangaonkar
22 May 2018
6 min read
Python's rise as the language of choice in Data Science is unprecedented, but not really unexpected. Apart from being a general-purpose language that can be used for a variety of tasks, from scripting to networking, Python offers a rich suite of libraries for general data science tasks such as scientific computing, data visualization, and more. However, one big challenge faced by data scientists is that these packages are not designed for scale. This is crucial in today's Big Data era, where tons of data needs to be processed and analyzed on the go. A platform which supports the existing Python ecosystem and allows it to scale across multiple machines and clusters without affecting performance was conspicuously missing. Enter Dask.

What is Dask?

Dask is a flexible parallel computing library written in Python for analytics, designed mainly to offer scalability and enhanced power to existing packages and libraries. It allows users to integrate their existing Python-based projects written in popular libraries such as NumPy, SciPy, pandas, and more. Its architecture (illustrated in the original article with a diagram, image courtesy of Slideshare) is built around two key components that interact with the Python libraries: dynamic task schedulers, which take care of the intensive computational workloads, and "Big Data" Dask collections, consisting of dataframes, parallel arrays, and interfaces that allow the computations to run on distributed environments.

Why use Dask?

Given there are already quite a few distributed platforms for large-scale data processing, such as Apache Spark, Apache Storm, Flink, and so on, why and when should one go for Dask? What are the advantages offered by this Python library? Let us take a look at the four major reasons to prefer Dask for distributed, scalable analytics in Python:

Easy to get started: If you are an existing Python user, you must have already worked with popular Python packages such as NumPy, SciPy, matplotlib, scikit-learn, pandas, and more. Dask offers a similar, intuitive interface, and since it is part of the bigger Python ecosystem, getting started with Dask is very easy. It mirrors the existing Python APIs, so switching between the popular packages and their Dask equivalents doesn't require spending a lot of time porting code (a short sketch of this switch appears after this list of reasons). For absolute beginners, using Dask for scalable analytics would be an easier and logical option to pursue, once they have grasped the fundamentals of Python and the associated libraries.

Scales up and down quite easily: You can run your project on Dask on a single machine, or on a cluster with thousands of cores, without significantly affecting the speed and performance of your code. Dask uses the multi-core CPUs within a single system optimally to process hundreds of terabytes of data without the need for additional hardware. Similarly, for moderate to large datasets spanning 100+ gigabytes, which often don't fit into a single storage device, the computing power of clusters can be coupled with Dask for effective analytics.

Supports complex applications: Many companies tend to tackle complex computations by introducing custom code that runs on popular Big Data tools such as Hadoop MapReduce and Apache Spark. However, with the help of the dynamic task scheduling feature of Dask, it is now possible to run and process complex applications without introducing any additional code. Dask is solely responsible for the smooth handling of various tasks such as network communication, load balancing, and diagnostics, among others.

Clear, responsive, real-time feedback: One of the most important features of Dask is its user-friendliness. Dask provides a real-time dashboard that highlights the key metrics of the processing task undertaken by the user, such as the current progress of your project, memory consumption, and more. It also offers an in-built IPython kernel that allows the user to investigate the ongoing computation from just a terminal.
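To give a flavor of how close the Dask interface stays to pandas, here is a minimal sketch (the file pattern and column names below are hypothetical placeholders, not taken from the article):

import dask.dataframe as dd

# Looks just like pandas, but the CSV files are read lazily, in parallel chunks
df = dd.read_csv("logs-2018-*.csv")

# This only builds a task graph; nothing heavy runs yet
mean_duration = df.groupby("user_id")["session_duration"].mean()

# compute() hands the graph to the scheduler and returns a plain pandas object
print(mean_duration.compute().head())

The same code runs unchanged on a laptop or, with a distributed scheduler attached, on a cluster.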
How Dask compares with Apache Spark

Apache Spark is one of the most popular and widely used Big Data tools for distributed data processing and analytics. Dask and Apache Spark have many features in common, prompting us and many other developers to ask the question: which tool is better? While Spark has been around for quite some time and has built up many standard, stable features over years of development, Dask is quite new and is still being improved as a tool. We summarize the important differences between Dask and Apache Spark in the table below:

Criteria | Apache Spark | Dask
Primary language | Scala | Python
Scale | Supports a single node to thousands of nodes in a cluster | Supports a single node to thousands of nodes in a cluster
Ecosystem | All-in-one, self-sufficient ecosystem | Integrates with popular libraries within the Python ecosystem
Flexibility | Low | High
Stream processing | Built-in module called Spark Streaming | Real-time interface that is fairly low-level and requires more work than Apache Spark
Graph processing | Possible with the GraphX module | Not possible
Machine learning | Uses the Spark MLlib module | Integrates with scikit-learn and XGBoost
Popularity | Very high; a commonly used tool in the Big Data ecosystem | Fairly new, but has already found its place in the pandas, scikit-learn, and Jupyter stack

You can read a detailed comparison of Apache Spark and Dask on the official Dask documentation page.

What we can expect from Dask

As we saw from the comparison above, it is fairly easy to port an existing Python project that uses high-profile Python libraries such as NumPy, scikit-learn, and more. Python developers and data scientists will appreciate the high flexibility and complex computational capabilities offered by Dask. The limited stream processing and graph processing features are big areas for improvement, but we can expect some developments in this domain in the near future. Even though Dask is still relatively new, it looks very promising due to its close affinity with the Python ecosystem. With Python's clout rising, many people would prefer a Python-based data processing tool which works at scale, without having to switch to an external Big Data framework. Dask may well be the superhero that comes to the developers' rescue in such cases. You can learn more about the latest developments in Dask on their official GitHub page.

Read more
Is Apache Spark today's Hadoop?
Apache Spark 2.3 now has native Kubernetes support!
Should you move to Python 3? 7 Python experts' opinions

A five-level learning roadmap for Functional Programmers

Sugandha Lahoti
12 Apr 2019
4 min read
The following guide serves as an excellent learning roadmap for functional programming. It can be used to track our level of knowledge regarding functional programming. This guide was developed for the Fantasyland Institute of Learning for the LambdaConf conference. It was designed for statically-typed functional programming languages that implement category theory.

This post is extracted from the book Hands-On Functional Programming with TypeScript by Remo H. Jansen. In this book, you will understand the pros, cons, and core principles of functional programming in TypeScript.

This roadmap describes five levels of difficulty: Beginner, Advanced Beginner, Intermediate, Proficient, and Expert. Languages such as Haskell support category theory natively, but we can take advantage of category theory in TypeScript by implementing it or by using some third-party libraries. Not all the items in the list are applicable to TypeScript due to language differences, but most of them are.

Beginner

To reach the beginner level, you will need to master the following concepts and skills:

Concepts: Immutable data; second-order functions; constructing and destructuring; function composition; first-class functions and lambdas.

Skills: Use second-order functions (map, filter, fold) on immutable data structures; destructure values to access their components; use data types to represent optionality; read basic type signatures; pass lambdas to second-order functions.

Advanced beginner

To reach the advanced beginner level, you will need to master the following concepts and skills:

Concepts: Algebraic data types; pattern matching; parametric polymorphism; general recursion; type classes, instances, and laws; lower-order abstractions (equal, semigroup, monoid, and so on); referential transparency and totality; higher-order functions; partial application, currying, and point-free style.

Skills: Solve problems without nulls, exceptions, or type casts; process and transform recursive data structures using recursion; use functional programming in the small; write basic monadic code for a concrete monad; create type class instances for custom data types; model a business domain with abstract data types (ADTs); write functions that take and return functions; reliably identify and isolate pure code from impure code; avoid introducing unnecessary lambdas and named parameters.

Intermediate

To reach the intermediate level, you will need to master the following concepts and skills:

Concepts: Generalized algebraic data types; higher-kinded types; Rank-N types; folds and unfolds; higher-order abstractions (category, functor, monad); basic optics; efficient persistent data structures; existential types; embedded DSLs using combinators.

Skills: Implement large functional programming applications; test code using generators and properties; write imperative code in a purely functional way through monads; use popular purely functional libraries to solve business problems; separate decision from effects; write a simple custom lawful monad; write production medium-sized projects; use lenses and prisms to manipulate data; simplify types by hiding irrelevant data with existentials.

Proficient

To reach the proficient level, you will need to master the following concepts and skills:

Concepts: Codata; (co)recursion schemes; advanced optics; dual abstractions (comonad); monad transformers; free monads and extensible effects; functional architecture; advanced functors (exponential, profunctors, contravariant); embedded domain-specific languages (DSLs) using generalized algebraic datatypes (GADTs); advanced monads (continuation, logic); type families and functional dependencies (FDs).

Skills: Design a minimally powerful monad transformer stack; write concurrent and streaming programs; use purely functional mocking in tests; use type classes to modularly model different effects; recognize type patterns and abstract over them; use functional libraries in novel ways; use optics to manipulate state; write custom lawful monad transformers; use free monads/extensible effects to separate concerns; encode invariants at the type level; effectively use FDs/type families to create safer code.

Expert

To reach the expert level, you will need to master the following concepts and skills:

Concepts: High performance; kind polymorphism; generic programming; type-level programming; dependent types and singleton types; category theory; graph reduction; higher-order abstract syntax; compiler design for functional languages; profunctor optics.

Skills: Design a generic, lawful library with broad appeal; prove properties manually using equational reasoning; design and implement a new functional programming language; create novel abstractions with laws; write distributed systems with certain guarantees; use proof systems to formally prove properties of code; create libraries that do not permit invalid states; use dependent typing to prove more properties at compile time; understand deep relationships between different concepts; profile, debug, and optimize purely functional code with minimal sacrifices.

Summary

This guide should be a good resource to guide you in your future functional-programming learning efforts. Read more on this in our book Hands-On Functional Programming with TypeScript.

What makes functional programming a viable choice for artificial intelligence projects?
Why functional programming in Python matters: Interview with best selling author, Steven Lott
Introducing Coconut for making functional programming in Python simpler

Is Future-Fetcher/Context API replacing Redux?

Amarabha Banerjee
13 Jul 2018
3 min read
At JSConf 2018, the former Redux head, Dan Abramov, announced a small tool that he had built to simplify data fetching and state management. It was called the future-fetcher/React Context API. Redux, so far, is one of the most popular state management tools and is used widely with React. Even Angular users are quite fond of it, and Vue also has a provision for using Redux in its ecosystem. However, tools like MobX are also gaining popularity because of their simplicity and ease of use.

What's the problem with Redux?

The honest answer is that it's simply too complicated. If we have to understand the real probability of it being replaced by any other tool, then we will have to understand how it works. The workflow is illustrated in the figure below.

(Image source: Freecodecamp.org)

The image shows how the basic Flux architecture functions, and Redux is based quite heavily on this architecture model. This can be very complicated for a novice web developer. A beginner-level developer might get overwhelmed by functional programming concepts such as action creators, dispatch, and action functions, and by having to use them in the appropriate situations. Redux follows the same application logic, and those who are not comfortable with functional programming might find using Redux quite cumbersome. That's where the future-fetcher/Context API comes in. The Context API is a production-grade, efficient API that supports things like static type checking and deep updates.

In React, the different application levels and layers consist of React components, and these components have nested relations with each other. In other words, they are connected to each other like a tree, and if one component changes its state and has to pass that information on to the next component, it transfers an entity called a 'prop'. State management is important because you want your application layers to be consistent with your data, so that when one component changes state, the relevant data is passed on to the components that need to respond accordingly. In Redux, you have to write the functions mentioned above to implement this. But in the Context API, the architecture looks a bit different from the Redux/Flux architecture, and there lies the difference.

(Image source: Freecodecamp.org)

In the case of the Context API, the need to write functions like actions and dispatchers vanishes, which makes the developer's job much easier. Here, we only have the 'view' and the 'store' components, where the store contains the dynamic state of the application layers. This simplifies a lot of processes, although scaling might be an issue in this particular form of architecture. Still, for normal web applications, where dynamic and real-time behavior is important, the Context API provides a much easier way of implementation. Since this feature has been developed by the primary architect of Redux, the developer community is of the opinion that Redux might face a tough challenge in the days to come. Still, it's too early to say: game over, Redux.

Creating Reusable Generic Modals in React and Redux
Connecting React to Redux & Firebase – Part 1
Connecting React to Redux and Firebase – Part 2

Why is triple-A game development unsustainable?

Raka Mahesa
12 Jun 2017
5 min read
The video game industry is a huge business that brought in over $91 billion in revenue during 2016 alone. Not only is it big, it's also a growing industry, with a projected yearly growth rate of 3.6%. So it's quite surprising that Cliff Bleszinski, a prominent figure in the game industry, made the remark that the business of modern triple-A games is unsustainable. While the statement may sound "click-bait-y", he's not the only person from the industry to voice concern about the business model. Back in 2012, a game director from Ubisoft, one of the biggest game publishers in the world, made a similar remark about how the development of triple-A games could be harmful. Seeing that more than one person is voicing a similar concern, maybe there's some truth to what they are saying. And if it is true, what makes triple-A game development unsustainable? Let's take a look.

So, before we go further, let's first clear up one thing: what are triple-A games?

Triple-A games (or AAA games) are the tier of video games with the highest development budgets. It's not a formal classification, so there isn't an exact budget threshold that must be passed for a game to be categorized as triple-A. Additionally, even though this classification makes it seem as if triple-A games are super-premium games of the highest quality, in reality most games you find in a video game store are triple-A games, sold at $60.

So that's the triple-A tier, but what other tiers of video games are there, and where are they sold? Well, there are indie games and double-A (AA) games. Indie games are made by a small team with a small budget and are sold at a price of $20 or lower. Double-A games are made with bigger budgets than indie games and sold at a higher price of $40. Both tiers of video games are sold digitally at digital storefronts like Steam and usually are not sold on physical media like DVDs. Do keep in mind that this classification is for PC or console video games and isn't really applicable to mobile games.

Also, it is important to note that this classification of video games doesn't determine which game has the better quality or which one has the better sales. After all, Minecraft is an indie game with a really small initial development team that has sold over 100 million copies. In comparison, Grand Theft Auto V, a triple-A game with a $250 million development budget, has "only" sold 75 million copies.

And yes, you read that right. Grand Theft Auto V has a development cost of $250 million, with half of that cost being marketing. Most triple-A games don't have as large a development budget, but they're still pretty expensive. Call of Duty: Modern Warfare 2 has a development cost of $200 million, The Witcher 3 has a development cost of $80 million, and the production cost (which means marketing cost is excluded) of Final Fantasy XIII is $65 million.

So, with that kind of budget, how do those games fare? Well, fortunately for Grand Theft Auto V, it made $1 billion in sales in just three days after it was released, making it the fastest-selling entertainment product of all time. Final Fantasy XIII has a different story though. Unlike Grand Theft Auto V with its 75 million sales number, the lifetime sales number of Final Fantasy XIII is only 6.6 million, which means it made roughly $350 million in sales; remember that its $65 million production cost excludes marketing, so the margin is far less comfortable than it looks. And this is why triple-A game development is unsustainable.
The development cost of those games is getting so high that the only way for the developer to become profitable is to sell millions and millions of copies. Meanwhile, each day there are more video games being released, making it harder for each game to gain sales. Grand Theft Auto V is the exception and not the rule here, since there aren't a lot of video games that can even reach 10 million in sales.

With that kind of budget, the development of every triple-A game has become very risky. After all, if a game doesn't sell well, the developer could lose tens of millions of dollars, enough to bankrupt a small company that doesn't have much funding. And even for a big company with plenty of funding, how many projects can fail before it is forced to shut down?

And with risky projects comes risk mitigation. With so much money at stake, developers are forced to play it safe and only work on games with mainstream appeal. Oh, the science fiction theme doesn't have as big an audience as a military theme? Let's only make games with a military theme then. But if all game developers think the same way, the video game market could end up with only a handful of genres, with all those developers competing for the same audience.

It's a vicious cycle, really. High-budget games need a high volume of sales to recoup their production costs. But for a game to achieve that volume of sales, it needs a high development budget to compete with the other games on the market.

So, if triple-A game development is truly unsustainable, would that mean those high-budget games will disappear from the market in the future? Well, it's possible. But as we've seen with Minecraft, you don't need hundreds of millions in development budget to create a good game that will sell well. So even though the number of video games with high budgets may diminish, high-quality video games will still exist.

About the Author

Raka Mahesa is a game developer at Chocoarts: http://chocoarts.com/, who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.

How to succeed in the gaming industry: 10 tips

Raka Mahesa
12 Jun 2017
5 min read
The gaming industry is a crowded field. After all, it's one of those industries where you can work on something you actually love, so a lot of people are trying to get into it. And with so many rivals, being successful in the industry is a difficult thing to accomplish. Here are 10 tips to help you succeed in the gaming industry. Do note that these are general tips, so they should be applicable to you regardless of your position in the industry, whether you're an indie developer working on your own games or a programmer working for a big gaming company.

Tip 1: Be creative

The gaming industry is a creative one, so it makes perfect sense that you need to be creative to succeed. And you don't have to be an artist or a writer to apply your creative thinking; there are many challenges that need creative solutions. For example, a particular system in the game may need some heavy computation, so instead of fully computing the problem, you might come up with a simpler formula and estimate the result.

Tip 2: Be capable of receiving criticism

Video games are a passion for many people, probably including you. That's why it's easy to fall in love with your own idea, whether it's a gameplay idea like how an enemy should behave, or a technical one like how a save file should be written. Your idea might not be perfect though, so it's important to be able to step back and see whether another person's criticism of your idea has merit. After all, that other person could be capable of seeing a flaw that you may have missed.

Tip 3: Be able to see the big picture

A video game's software is full of complex, interlocking systems. Being able to see the big picture, that is, seeing how changes in one system could affect another system, is a really useful skill to have when developing a video game.

Tip 4: Keep up with technology

Technology moves at a blistering speed. Technology that is relevant today may be rendered useless tomorrow, so it is very important to keep up. Using the latest equipment may help your game project, and the newest technology may provide opportunities for your games too. For example, newer platforms like VR and AR don't have many games yet, so it's easier to gain visibility there.

Tip 5: Keep up with industry trends

It's not just technology that moves fast, but also the world. Just 10 years ago, it was unthinkable that millions of people would watch other people play games, or that mobile gaming would be bigger than console gaming. By keeping up with industry trends, we can understand the market for our games and, more importantly, understand our players' behavior.

Tip 6: Put yourself in your player's shoes

Being able to see your games from the viewpoint of your player is a really useful skill to have. For example, as a developer you may feel fine looking at a black screen while your game is loading its resources, because you know the game is working fine as long as it doesn't throw an error dialog. Your player, however, probably doesn't feel the same way and will think the game has hung when it shows a black screen without a resource loading indicator.

Tip 7: Understand your platform and your audience

This is a bit similar to the previous tip, but on a more general level. Each platform has different strengths, and the audience of each platform also has different expectations.
For example, games for mobile platforms are expected to be played in short bursts instead of hour-long sessions, so mobile gamers expect their games to automatically save progress whenever they stop playing. Understanding this behavior is really important for developing games that satisfy players.

Tip 8: Be a team player

Unless you're a one-man army, games usually are not developed alone. Since game development is a team effort, it's pretty important to get along with your teammates, whether that means dividing tasks fairly with your programmer buddy or explaining to the artist the format of the art assets that your game needs.

Tip 9: Show your creation to other people

When you are deep in the process of working on your latest creation, it's sometimes hard to take a step back and assess it fairly. Occasionally you may even feel like your creations aren't up to scratch. Fortunately, showing your work to other people is a relatively easy way to get good, honest feedback. And if you're lucky, your new audience may just show you that your creation is actually up to standard.

Tip 10: Networking

This is probably the most generic tip ever, but that doesn't mean it's not true. In any industry, and no matter what your position is, networking is really important. If you're an indie developer, you may connect with a development partner who shares the same vision as you. Alternatively, if you're a programmer, maybe you will connect with someone who's looking to fill a senior position leading a new game project. Networking will open the door to opportunities for you.

About the author

Raka Mahesa is a game developer at Chocoarts: http://chocoarts.com/, who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.

A brief history of Blockchain

Packt Editorial Staff
09 Apr 2018
6 min read
History - where do we start?

Blockchain was introduced with the invention of Bitcoin in 2008, and its first practical implementation followed in 2009. Of course, Blockchain and Bitcoin are very different things, but you can't tell the full story behind the history of Blockchain without starting with Bitcoin.

Electronic cash before Blockchain

The concept of electronic cash or digital currency is not new. Since the 1980s, e-cash protocols have existed, based on a model proposed by David Chaum. This is an extract from the new edition of Mastering Blockchain. Just as you need to understand the concept of distributed systems to properly understand Blockchain, you also need to understand electronic cash. This concept pre-dates Blockchain and Bitcoin, but without it, we would certainly not be where we are today.

Two fundamental e-cash system issues need to be addressed: accountability and anonymity. Accountability is required to ensure that cash is spendable only once (the double-spend problem) and that it can only be spent by its rightful owner. The double-spend problem arises when the same money can be spent twice. As it is quite easy to make copies of digital data, this becomes a big issue in digital currencies, since you could make many copies of the same digital cash. Anonymity is required to protect users' privacy. As with physical cash, it is almost impossible to trace spending back to the individual who actually paid the money.

David Chaum solved both of these problems during his work in the 1980s by using two cryptographic operations, namely blind signatures and secret sharing. Blind signatures allow a document to be signed without actually seeing it, and secret sharing is a concept that enables the detection of double spending, that is, using the same e-cash token twice.

In 2009, the first practical implementation of an electronic cash (e-cash) system, named Bitcoin, appeared. The term cryptocurrency emerged later. For the very first time, it solved the problem of distributed consensus in a trustless network. It used public key cryptography with a Proof of Work (PoW) mechanism to provide a secure, controlled, and decentralized method of minting digital currency. The key innovation was the idea of an ordered list of blocks composed of transactions and cryptographically secured by the PoW mechanism. Other technologies that served as precursors to Bitcoin include Merkle trees, hash functions, and hash chains. Looking at all the technologies mentioned earlier and their relevant history, it is easy to see how concepts from electronic cash schemes and distributed systems were combined to create Bitcoin and what is now known as Blockchain.

Blockchain and Satoshi Nakamoto

In 2008, a groundbreaking paper entitled Bitcoin: A Peer-to-Peer Electronic Cash System was written on the topic of peer-to-peer electronic cash under the pseudonym Satoshi Nakamoto. It introduced the term chain of blocks. No one knows the actual identity of Satoshi Nakamoto. After introducing Bitcoin in 2009, he remained active in the Bitcoin developer community until 2011. He then handed over Bitcoin development to its core developers and simply disappeared. Since then, there has been no communication from him whatsoever, and his existence and identity are shrouded in mystery. The term chain of blocks evolved over the years into the word Blockchain.
Since that point, the history of Blockchain is really the history of its application in different industries. The most notable area is, unsurprisingly, finance. Blockchain has been shown to improve the speed and security of financial transactions. While it hasn't yet become embedded in the mainstream of the financial sector, it is surely only a matter of time before it begins to take hold.

How it has evolved in recent years

In Blockchain: Blueprint for a New Economy, Melanie Swan identifies three different tiers of Blockchain. These tiers all showcase how Blockchain is currently evolving. It's worth noting that these tiers or versions aren't simple chronological points in the history of Blockchain. The lines between them are blurred, and which features and capabilities appear ultimately depends on how Blockchain technology is being applied.

Blockchain 1.0: This tier was introduced with the invention of Bitcoin, and it is primarily used for cryptocurrencies. As Bitcoin was the first implementation of a cryptocurrency, it makes sense for this first generation of Blockchain technology to include only cryptographic currencies. All alternative cryptocurrencies, as well as Bitcoin, fall into this category. It includes core applications such as payments. This generation started in 2009, when Bitcoin was released, and ended in early 2010.

Blockchain 2.0: This second Blockchain generation is used by financial services and smart contracts. This tier includes various financial assets, such as derivatives, options, swaps, and bonds. Applications that go beyond currency, finance, and markets are incorporated at this tier. Ethereum, Hyperledger, and other newer Blockchain platforms are considered part of Blockchain 2.0. This generation started when ideas related to using blockchain for other purposes began to emerge in 2010.

Blockchain 3.0: This third Blockchain generation is used to implement applications beyond the financial services industry, in areas such as government, health, media, the arts, and justice. Again, as in Blockchain 2.0, Ethereum, Hyperledger, and newer blockchains with the ability to code smart contracts are considered part of this tier. This generation of Blockchain emerged around 2012, when multiple applications of Blockchain technology in different industries were researched.

Blockchain X.0: This generation represents a vision of Blockchain singularity, where one day there will be a public Blockchain service available that anyone can use, just like the Google search engine. It will provide services for all realms of society. It will be a public and open distributed ledger with general-purpose rational agents (Machina economicus) running on a Blockchain, making decisions and interacting with other intelligent autonomous agents on behalf of people, and regulated by code instead of law or paper contracts. This does not mean that law and contracts will disappear; instead, law and contracts will be implementable in code.

Like any history, this history of Blockchain isn't exhaustive. But it does hopefully give you an idea of how it has developed to where we are today. Check out this tutorial to write your first Blockchain program.
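As a small taste of what such a program involves, here is a minimal, purely illustrative Python sketch of a chain of blocks secured by a toy proof-of-work scheme (real systems such as Bitcoin are vastly more sophisticated; the block fields, transactions, and difficulty below are arbitrary choices for the example):

import hashlib
import json

def hash_block(block):
    # Hash a canonical JSON encoding of the block's contents
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine(previous_hash, transactions, difficulty=4):
    # Proof of work: find a nonce so that the block's hash starts with `difficulty` zeros
    nonce = 0
    while True:
        block = {"prev": previous_hash, "txns": transactions, "nonce": nonce}
        digest = hash_block(block)
        if digest.startswith("0" * difficulty):
            return block, digest
        nonce += 1

chain = []
prev_hash = "0" * 64  # the genesis block has no real predecessor
for txns in (["alice->bob:5"], ["bob->carol:2"]):
    block, prev_hash = mine(prev_hash, txns)
    chain.append(block)

print(chain)

Because every block embeds the hash of the one before it, tampering with any historical block would change its hash and break the whole chain, which is exactly the property the ordered list of blocks and the PoW mechanism described above are designed to protect.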

How can a data scientist get into game development?

Graham Annett
07 Aug 2017
5 min read
One of the most interesting uses for data science lies in and around the game development process. While it may not be immediately obvious that data science is applicable to game development, it is increasingly becoming an enticing area, both from a user-engagement perspective and as a source of data collection for deep learning and data science related tasks.

Games and data collection

With the increase in reinforcement learning oriented deep learning tasks in the past few years, the appeal of using games as a method for collecting data (somewhat in parallel to collecting data on Mechanical Turk or various other crowdsourcing platforms) has never been greater. The main idea behind data collection for these types of tasks is capturing the graphical display at a given time and recording the user input for that image frame. From this data, it's possible to connect these inputs to some end result (such as the final score) that can later be optimized and used as an objective cost function to be minimized or maximized. With this, it's possible to collect a large corpus of user data for deep learning algorithms to initially train on, which they can then use for the computer to play itself (something akin to this was done for AlphaGo and various other game-related reinforcement learning bots). With the incredible influx of processing power now available, it's possible for computers to play themselves thousands and millions of times to learn from themselves and their own shortcomings.

Deep learning uses

Practical uses of this type of deep learning that a data scientist may find interesting range from creating smart AI systems that are more engaging to a player, to finding transferable algorithms and data sources that can be used elsewhere. For example, many of the OpenAI algorithms are intended to be trained in one game with the hope that they will be transferable to another game and still do well (albeit with new parameters and a newly learned cost function). This type of deep learning is incredibly interesting from a data scientist's perspective because it frees you from having to highly optimize each individual game or task, and lets you instead find commonalities and generalizable methodologies that translate across systems and games.

Technical skills

Many of the technical skills needed for building data-collection pipelines out of game development are much more development-oriented than a traditional data scientist may be used to, and may require learning new skills. These skills are broader than traditional data scientist roles, ranging from data collection and data pipelining from the games to scaling deep learning training and implementing new algorithms during training. They are becoming more vital as data scientists are increasingly expected both to provide insight and to build integrations into a product.

Exploring projects and tools

A data scientist may go about getting into this area by exploring projects and tools such as OpenAI's Gym and Facebook's MazeBase. These projects are very deep learning oriented, though, and may not be what a traditional data scientist thinks of when they are interested in game development. A minimal example of what working with such a tool looks like is sketched below.
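For instance, a minimal sketch of the interaction loop in OpenAI's Gym might look like the following (this assumes the classic CartPole environment and the pre-0.26 Gym API; it is an illustration, not code taken from the article):

import gym

env = gym.make("CartPole-v0")
observation = env.reset()

for step in range(200):
    action = env.action_space.sample()  # a random policy, just to exercise the loop
    observation, reward, done, info = env.step(action)
    # the (observation, action, reward) tuples seen here are exactly the kind
    # of data you would log in order to train an agent later
    if done:
        observation = env.reset()

env.close()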
Data oriented/driven game design

Another approach is data-oriented or data-driven game design. While this is not a new concept by any means, it has become increasingly ubiquitous as in-app purchasing and subscription-based gaming plans have become a common theme on mobile and other gaming platforms. These types of data science tasks are not unlike normal data science projects, in that they seek to understand, from a statistical perspective, what is happening to users at specific points in the games. There is a pretty big overlap between projects like this and projects that aim to understand, for instance, when a user abandons a cart during an online order. The data for the games may be oriented around when the gamer gave up on a quest, or at what point users are willing to make an in-app purchase to achieve a goal more quickly. Since these are quantifiable and objective goals, they are an incredibly good fit for traditional supervised learning tasks and can be approached with traditional supervised learning baselines and algorithms.

The end result of these tasks may include things such as making a quest or goal easier, or making an in-app purchase cheaper during a specific interval in which the user would be more inclined to buy (much like offering a user a coupon when a cart is abandoned during checkout often entices the user to come back and finish the purchase).

While both of these paths are game development oriented, they differ quite a lot in that one is much more traditionally data-analytical, and the other is much more deep learning engineering oriented. They are both highly interesting areas to explore from a professional standpoint, but data-driven game development may be somewhat limited from a hobbyist standpoint outside of Kaggle competitions (a quick search didn't turn up any previous competitions with this sort of data), since many companies would be quite hesitant to provide this sort of data if their entire business model is based around in-app purchases and recurring revenue from players.

Overall, these are both incredibly enticing areas and great avenues to pursue, and they provide plenty of interesting problems that you may not encounter outside of game development.

About the Author

Graham Annett is an NLP Engineer at Kip (Kipthis.com). He has been interested in deep learning for a bit over a year and has worked with and contributed to Keras (https://github.com/fchollet/keras). He can be found on Github at http://github.com/grahamannett or via http://grahamannett.me.

7 Tips For Python Performance

Gabriel Marcondes
24 Jun 2016
7 min read
When you begin using Python after using other languages, it's easy to bring a lot of idioms with you. Though they may work, they are not the best, most beautiful, or fastest ways to get things done with Python; they're not pythonic. I've put together some tips on basic things that can provide big performance improvements, and I hope they'll serve as a starting point for you as you develop with Python.

Use comprehensions

Comprehensions are great. Python knows how to make lists, tuples, sets, and dicts from single statements, so you don't need to declare, initialize, and append things to your sequences as you do in Java. It helps not only with readability but also with performance; if you delegate something to the interpreter, it will make it faster.

def do_something_with(value):
    return value * 2

# this is an anti-pattern
my_list = []
for value in range(10):
    my_list.append(do_something_with(value))

# this is beautiful and faster
my_list = [do_something_with(value) for value in range(10)]

# and you can even plug in some validation
def some_validation(value):
    return value % 2

my_list = [do_something_with(value) for value in range(10) if some_validation(value)]
my_list
[2, 6, 10, 14, 18]

And it looks the same for other types. You just need to change the appropriate surrounding symbols to get what you want.

my_tuple = tuple(do_something_with(value) for value in range(10))
my_tuple
(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)

my_set = {do_something_with(value) for value in range(10)}
my_set
{0, 2, 4, 6, 8, 10, 12, 14, 16, 18}

my_dict = {value: do_something_with(value) for value in range(10)}
my_dict
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}

Use Generators

Generators are objects that generate sequences one item at a time. This provides a great gain in performance when working with large data, because it won't generate the whole sequence unless needed, and it's also a memory saver. A simple way to use generators is very similar to the comprehensions we saw above, but you enclose the expression with () instead of [], for example:

my_generator = (do_something_with(value) for value in range(10))
my_generator
<generator object <genexpr> at 0x7f0d31c207e0>

The range function itself returns a lazy, generator-like sequence in Python 3 (if you're using legacy Python 2, you need xrange for that behavior). Once you have a generator, call next to iterate over its items, or use it as a parameter to a sequence constructor if you really need all the values:

next(my_generator)
0
next(my_generator)
2

To create your own generators, use the yield keyword inside a loop instead of a regular return at the end of a function or method. Each time you call next on it, your code will run until it reaches a yield statement, and it saves the state for the next time you ask for a value.

# this is a generator that infinitely returns a sequence of numbers, adding 1 to the previous one
def my_generator_creator(start_value):
    while True:
        yield start_value
        start_value += 1

my_integer_generator = my_generator_creator(0)
my_integer_generator
<generator object my_generator_creator at 0x7f0d31c20708>
next(my_integer_generator)
0
next(my_integer_generator)
1

The benefits of generators in this case are obvious: you could never create the whole infinite sequence before using it. A great use of this, for example, is for reading a file stream; a small sketch of that pattern follows.
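For instance, a minimal sketch of lazily reading a (hypothetical) log file line by line, without ever loading the whole file into memory, could look like this:

def read_lines(path):
    # yields one line at a time, so even huge files never fill up memory
    with open(path) as stream:
        for line in stream:
            yield line.rstrip("\n")

for line in read_lines("big_access_log.txt"):  # the file name is just an example
    if "ERROR" in line:
        print(line)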
They're of a mapping kind, an unordered set of values where an item's positions are calculated rather than searched. When you search for a value inside a list, the interpreter searches the entire list to see whether the value is there. If you are lucky, the value is the first of the sequence; on the other hand, it could even be the last in a long list. When working with sets, the membership check always takes the same time because the positions are calculated and the interpreter knows where to search for the value. If you're using long sets or loops, the performance gain is sensible. You can create a set from any iterable object as long as the values are hashable. my_list_of_fruits = ['apple', 'banana', 'coconut', 'damascus'] my_set_of_fruits = set(my_list_of_fruits) my_set_of_fruits {'apple', 'banana', 'coconut', 'damascus'} 'apple' in my_set_of_fruits True 'watermelon' in my_set_of_fruits False Deal with strings the right way You've probably done or read something like this before: # this is an anti-pattern "some string, " + "some other string, " + "and yet another one" 'some string, some other string, and yet another one' It might look easy and fast to write, but it's terrible for your performance. str objects are immutable, so each time you add strings, trying to append them, you're actually creating new strings. There are a handful of methods to deal with strings in a faster and optimal way. To join strings, use the join method on a separator with a sequence of strings. The separator can be an empty string if you just want to concatenate. # join a sequence of strings, based on the separator you want ", ".join(["some string", "some other string", "and yet another one"]) 'some string, some other string, and yet another one' ''.join(["just", "concatenate"]) 'justconcatenate' To merge strings, for example, to insert information in templates, we have a classical way; it resembles the C language: # the classical way "hello %s, my name is %s" % ("everyone", "Gabriel") 'hello everyone, my name is Gabriel' And then there is the modern way, with the format method. It is quite flexible: # formatting with sequencial strings "hello {}, my name is {}".format("everyone", "Gabriel") 'hello everyone, my name is Gabriel' # formatting with indexed strings "hello {1}, my name is {2} and we all love {0}".format("Python", "everyone", "Gabriel") 'hello everyone, my name is Gabriel and we all love Python' Avoid intermediate outputs Every programmer in this world has used print statements for debugging or progress checking purposes at least once. If you don't know pdb for debugging yet, you should check it out immediately. But I'll agree that it's really easy to write print statements inside your loops to keep track of where your program is. I'll just tell you to avoid them because they're synchronous and will significantly raise the execution time. You can think of alternative ways to check progress, such as watching via the filesystem the files that you have to generate anyway. Asynchronous programming is a huge topic that you should take a look at if you're dealing with a lot of I/O operations. Cache the most requested results Caching is one of the greatest performance tunning tweaks you'll ever find. Python gives us a handy way of caching function calls with a simple decorator, functools.lru_cache. Each time you call a function that is decorated with lru_cache, the interpreter checks whether that call was made recently, on a cache that is a dictionary of parameters-result pairs. 
dict checks are as fast as those of set, and if we have repetitive calls, it's worth looking at this cache before running the code again.

from functools import lru_cache

@lru_cache(maxsize=16)
def my_repetitive_function(value):
    # pretend this is an extensive calculation
    return value * 2

for value in range(100):
    my_repetitive_function(value % 8)

The decorator gives the method cache_info, where we can find the statistics about the cache. We can see the eight misses (for the eight times the function was really called), and 92 hits. As we only have eight different inputs (because of the % 8 thing), the cache size was never fully filled.

my_repetitive_function.cache_info()
CacheInfo(hits=92, misses=8, maxsize=16, currsize=8)

Read

In addition to these six tips, read a lot, every day. Read books and other people's code. Make code and talk about code. Time, practice, and exchanging experiences will make you a great programmer, and you'll naturally write better Python code.

About the author

Gabriel Marcondes is a computer engineer working on Django and Python in São Paulo, Brazil. When he is not coding, you can find him at @ggzes, talking about rock n' roll, football, and his attempts to send manned missions to fictional moons.