
How-To Tutorials

6719 Articles

Getting started with F# for .Net Core application development [Tutorial]

Aaron Lazar
16 Aug 2018
17 min read
F# is Microsoft's purely functional programming language, that can be used along with the .NET Core framework. In this article, we will get introduced to F# to leverage .NET Core for our application development. This article is extracted from the book, .NET Core 2.0 By Example, written by Rishabh Verma and Neha Shrivastava. Basics of classes Classes are types of object which can contain functions, properties, and events. An F# class must have a parameter and a function attached like a member. Both properties and functions can use the member keyword. The following is the class definition syntax: type [access-modifier] type-name [type-params] [access-modifier] (parameter-list) [ as identifier ] = [ class ] [ inherit base-type-name(base-constructor-args) ] [ let-bindings ] [ do-bindings ] member-list [ end ] // Mutually recursive class definitions: type [access-modifier] type-name1 ... and [access-modifier] type-name2 ... Let’s discuss the preceding syntax for class declaration: type: In the F# language, class definition starts with a type keyword. access-modifier: The F# language supports three access modifiers—public, private, and internal. By default, it considers the public modifier if no other access modifier is provided. The Protected keyword is not used in the F# language, and the reason is that it will become object oriented rather than functional programming. For example, F# usually calls a member using a lambda expression and if we make a member type protected and call an object of a different instance, it will not work. type-name: It is any of the previously mentioned valid identifiers; the default access modifier is public. type-params: It defines optional generic type parameters. parameter-list: It defines constructor parameters; the default access modifier for the primary constructor is public. identifier: It is used with the optional as keyword, the as keyword gives a name to an instance variable which can be used in the type definition to refer to the instance of the type. Inherit: This keyword allows us to specify the base class for a class. let-bindings: This is used to declare fields or function values in the context of a class. do-bindings: This is useful for the execution of code to create an object member-list: The member-list comprises extra constructors, instance and static method declarations, abstract bindings, interface declarations, and event and property declarations. Here is an example of a class: type StudentName(firstName,lastName) = member this.FirstName = firstName member this.LastName = lastName In the previous example, we have not defined the parameter type. By default, the program considers it as a string value but we can explicitly define a data type, as follows: type StudentName(firstName:string,lastName:string) = member this.FirstName = firstName member this.LastName = lastName Constructor of a class In F#, the constructor works in a different way to any other .NET language. The constructor creates an instance of a class. A parameter list defines the arguments of the primary constructor and class. The constructor contains let and do bindings, which we will discuss next. We can add multiple constructors, apart from the primary constructor, using the new keyword and it must invoke the primary constructor, which is defined with the class declaration. The syntax of defining a new constructor is as shown: new (argument-list) = constructor-body Here is an example to explain the concept. 
In the following code, the StudentDetail class has two constructors: a primary constructor that takes two arguments and another constructor that takes no arguments: type StudentDetail(x: int, y: int) = do printfn "%d %d" x y new() = StudentDetail(0, 0) A let and do binding A let and do binding creates the primary constructor of a class and runs when an instance of a class is created. A function is compiled into a member if it has a let binding. If the let binding is a value which is not used in any function or member, then it is compiled into a local variable of a constructor; otherwise, it is compiled into a field of the class. The do expression executes the initialized code. As any extra constructors always call the primary constructor, let and do bindings always execute, irrespective of which constructor is called. Fields that are created by let bindings can be accessed through the methods and properties of the class, though they cannot be accessed from static methods, even if the static methods take an instance variable as a parameter: type Student(name) as self = let data = name do self.PrintMessage() member this.PrintMessage() = printf " Student name is %s" data Generic type parameters F# also supports a generic parameter type. We can specify multiple generic type parameters separated by a comma. The syntax of a generic parameter declaration is as follows: type MyGenericClassExample<'a> (x: 'a) = do printfn "%A" x The type of the parameter infers where it is used. In the following code, we call the MyGenericClassExample method and pass a sequence of tuples, so here the parameter type became a sequence of tuples: let g1 = MyGenericClassExample( seq { for i in 1 .. 10 -> (i, i*i) } ) Properties Values related to an object are represented by properties. In object-oriented programming, properties represent data associated with an instance of an object. The following snippet shows two types of property syntax: // Property that has both get and set defined. [ attributes ] [ static ] member [accessibility-modifier] [self- identifier.]PropertyName with [accessibility-modifier] get() = get-function-body and [accessibility-modifier] set parameter = set-function-body // Alternative syntax for a property that has get and set. [ attributes-for-get ] [ static ] member [accessibility-modifier-for-get] [self-identifier.]PropertyName = get-function-body [ attributes-for-set ] [ static ] member [accessibility-modifier-for-set] [self- identifier.]PropertyName with set parameter = set-function-body There are two kinds of property declaration: Explicitly specify the value: We should use the explicit way to implement the property if it has non-trivial implementation. We should use a member keyword for the explicit property declaration. Automatically generate the value: We should use this when the property is just a simple wrapper for a value. There are many ways of implementing an explicit property syntax based on need: Read-only: Only the get() method Write-only: Only the set() method Read/write: Both get() and set() methods An example is shown as follows: // A read-only property. member this.MyReadOnlyProperty = myInternalValue // A write-only property. member this.MyWriteOnlyProperty with set (value) = myInternalValue <- value // A read-write property. member this.MyReadWriteProperty with get () = myInternalValue and set (value) = myInternalValue <- value Backing stores are private values that contain data for properties. 
The keyword, member val instructs the compiler to create backing stores automatically and then gives an expression to initialize the property. The F# language supports immutable types, but if we want to make a property mutable, we should use get and set. As shown in the following example, the MyClassExample class has two properties: propExample1 is read-only and is initialized to the argument provided to the primary constructor, and propExample2 is a settable property initialized with a string value ".Net Core 2.0": type MyClassExample(propExample1 : int) = member val propExample1 = property1 member val propExample2 = ".Net Core 2.0" with get, set Automatically implemented properties don't work efficiently with some libraries, for example, Entity Framework. In these cases, we should use explicit properties. Static and instance properties There can be further categorization of properties as static or instance properties. Static, as the name suggests, can be invoked without any instance. The self-identifier is neglected by the static property while it is necessary for the instance property. The following is an example of the static property: static member MyStaticProperty with get() = myStaticValue and set(value) = myStaticValue <- value Abstract properties Abstract properties have no implementation and are fully abstract. They can be virtual. It should not be private and if one accessor is abstract all others must be abstract. The following is an example of the abstract property and how to use it: // Abstract property in abstract class. // The property is an int type that has a get and // set method [<AbstractClass>] type AbstractBase() = abstract Property1 : int with get, set // Implementation of the abstract property type Derived1() = inherit AbstractBase() let mutable value = 10 override this.Property1 with get() = value and set(v : int) = value <- v // A type with a "virtual" property. type Base1() = let mutable value = 10 abstract Property1 : int with get, set default this.Property1 with get() = value and set(v : int) = value <- v // A derived type that overrides the virtual property type Derived2() = inherit Base1() let mutable value2 = 11 override this.Property1 with get() = value2 and set(v) = value2 <- v Inheritance and casts In F#, the inherit keyword is used while declaring a class. The following is the syntax: type MyDerived(...) = inherit MyBase(...) In a derived class, we can access all methods and members of the base class, but it should not be a private member. To refer to base class instances in the F# language, the base keyword is used. Virtual methods and overrides  In F#, the abstract keyword is used to declare a virtual member. So, here we can write a complete definition of the member as we use abstract for virtual. F# is not similar to other .NET languages. Let's have a look at the following example: type MyClassExampleBase() = let mutable x = 0 abstract member virtualMethodExample : int -> int default u. virtualMethodExample (a : int) = x <- x + a; x type MyClassExampleDerived() = inherit MyClassExampleBase () override u. virtualMethodExample (a: int) = a + 1 In the previous example, we declared a virtual method, virtualMethodExample, in a base class, MyClassExampleBase, and overrode it in a derived class, MyClassExampleDerived. Constructors and inheritance An inherited class constructor must be called in a derived class. If a base class constructor contains some arguments, then it takes parameters of the derived class as input. 
In the following example, we will see how derived class arguments are passed in the base class constructor with inheritance: type MyClassBase2(x: int) = let mutable z = x * x do for i in 1..z do printf "%d " i type MyClassDerived2(y: int) = inherit MyClassBase2(y * 2) do for i in 1..y do printf "%d " i If a class has multiple constructors, such as new(str) or new(), and this class is inherited in a derived class, we can use a base class constructor to assign values. For example, DerivedClass, which inherits BaseClass, has new(str1,str2), and in place of the first string, we pass inherit BaseClass(str1). Similarly for blank, we wrote inherit BaseClass(). Let's explore the following example for more detail: type BaseClass = val string1 : string new (str) = { string1 = str } new () = { string1 = "" } type DerivedClass = inherit BaseClass val string2 : string new (str1, str2) = { inherit BaseClass(str1); string2 = str2 } new (str2) = { inherit BaseClass(); string2 = str2 } let obj1 = DerivedClass("A", "B") let obj2 = DerivedClass("A") Functions and lambda expressions A lambda expression is one kind of anonymous function, which means it doesn't have a name attached to it. But if we want to create a function which can be called, we can use the fun keyword with a lambda expression. We can pass the input parameter in the lambda function, which is created using the fun keyword. This function is quite similar to a normal F# function. Let's see a normal F# function and a lambda function: // Normal F# function let addNumbers a b = a+b // Evaluating values let sumResult = addNumbers 5 6 // Lambda function and evaluating values let sumResult = (fun (a:int) (b:int) -> a+b) 5 6 // Both the function will return value sumResult = 11 Handling data – tuples, lists, record types, and data manipulation F# supports many data types, for example: Primitive types: bool, int, float, string values. Aggregate type: class, struct, union, record, and enum Array: int[], int[ , ], and float[ , , ] Tuple: type1 * type2 * like (a,1,2,true) type is—char * int * int * bool Generic: list<’x>, dictionary < ’key, ’value> In an F# function, we can pass one tuple instead of multiple parameters of different types. Declaration of a tuple is very simple and we can assign values of a tuple to different variables, for example: let tuple1 = 1,2,3 // assigning values to variables , v1=1, v2= 2, v3=3 let v1,v2,v3 = tuple1 // if we want to assign only two values out of three, use “_” to skip the value. Assigned values: v1=1, //v3=3 let v1,_,v3 = tuple In the preceding examples, we saw that tuple supports pattern matching. These are option types and an option type in F# supports the idea that the value may or not be present at runtime. List List is a generic type implementation. An F# list is similar to a linked list implementation in any other functional language. It has a special opening and closing bracket construct, a short form of the standard empty list ([ ]) syntax: let empty = [] // This is an empty list of untyped type or we can say //generic type. Here type is: 'a list let intList = [10;20;30;40] // this is an integer type list The cons operator is used to prepend an item to a list using a double colon cons(prepend,::). 
To append another list to one list, we use the append operator—@: // prepend item x into a list let addItem xs x = x :: xs let newIntList = addItem intList 50 // add item 50 in above list //“intlist”, final result would be- [50;10;20;30;40] // using @ to append two list printfn "%A" (["hi"; "team"] @ ["how";"are";"you"]) // result – ["hi"; "team"; "how";"are";"you"] Lists are decomposable using pattern matching into a head and a tail part, where the head is the first item in the list and the tail part is the remaining list, for example: printfn "%A" newIntList.Head printfn "%A" newIntList.Tail printfn "%A" newIntList.Tail.Tail.Head let rec listLength (l: 'a list) = if l.IsEmpty then 0 else 1 + (listLength l.Tail) printfn "%d" (listLength newIntList) Record type The class, struct, union, record, and enum types come under aggregate types. The record type is one of them, it can have n number of members of any individual type. Record type members are by default immutable but we can make them mutable. In general, a record type uses the members as an immutable data type. There is no way to execute logic during instantiation as a record type don't have constructors. A record type also supports match expression, depending on the values inside those records, and they can also again decompose those values for individual handling, for example: type Box = {width: float ; height:int } let giftbox = {width = 6.2 ; height = 3 } In the previous example, we declared a Box with float a value width and an integer height. When we declare giftbox, the compiler automatically detects its type as Box by matching the value types. We can also specify type like this: let giftbox = {Box.width = 6.2 ; Box.height = 3 } or let giftbox : Box = {width = 6.2 ; height = 3 } This kind of type declaration is used when we have the same type of fields or field type declared in more than one type. This declaration is called a record expression. Object-oriented programming in F# F# also supports implementation inheritance, the creation of object, and interface instances. In F#, constructed types are fully compatible .NET classes which support one or more constructors. We can implement a do block with code logic, which can run at the time of class instance creation. The constructed type supports inheritance for class hierarchy creation. We use the inherit keyword to inherit a class. If the member doesn't have implementation, we can use the abstract keyword for declaration. We need to use the abstractClass attribute on the class to inform the compiler that it is abstract. If the abstractClass attribute is not used and type has all abstract members, the F# compiler automatically creates an interface type. Interface is automatically inferred by the compiler as shown in the following screenshot: The override keyword is used to override the base class implementation; to use the base class implementation of the same member, we use the base keyword. In F#, interfaces can be inherited from another interface. In a class, if we use the construct interface, we have to implement all the members in the interface in that class, as well. In general, it is not possible to use interface members from outside the class instance, unless we upcast the instance type to the required interface type. To create an instance of a class or interface, the object expression syntax is used. 
We need to override virtual members if we are creating a class instance and need member implementation for interface instantiation: type IExampleInterface = abstract member IntValue: int with get abstract member HelloString: unit -> string type PrintValues() = interface IExampleInterface with member x.IntValue = 15 member x.HelloString() = sprintf "Hello friends %d" (x :> IExampleInterface).IntValue let example = let varValue = PrintValues() :> IExampleInterface { new IExampleInterface with member x.IntValue = varValue.IntValue member x.HelloString() = sprintf "<b>%s</b>" (varValue.HelloString()) } printfn "%A" (example.HelloString()) Exception handling The exception keyword is used to create a custom exception in F#; these exceptions adhere to Microsoft best practices, such as constructors supplied, serialization support, and so on. The keyword raise is used to throw an exception. Apart from this, F# has some helper functions, such as failwith, which throws a failure exception at F# runtime, and invalidop, invalidarg, which throw the .NET Framework standard type invalid operation and invalid argument exception, respectively. try/with is used to catch an exception; if an exception occurred on an expression or while evaluating a value, then the try/with expression could be used on the right side of the value evaluation and to assign the value back to some other value. try/with also supports pattern matching to check an individual exception type and extract an item from it. try/finally expression handling depends on the actual code block. Let's take an example of declaring and using a custom exception: exception MyCustomExceptionExample of int * string raise (MyCustomExceptionExample(10, "Error!")) In the previous example, we created a custom exception called MyCustomExceptionExample, using the exception keyword, passing value fields which we want to pass. Then we used the raise keyword to raise exception passing values, which we want to display while running the application or throwing the exception. However, as shown here, while running this code, we don't get our custom message in the error value and the standard exception message is displayed: We can see in the previous screenshot that the exception message doesn't contain the message that we passed. In order to display our custom error message, we need to override the standard message property on the exception type. We will use pattern matching assignment to get two values and up-cast the actual type, due to the internal representation of the exception object. If we run this program again, we will get the custom message in the exception: exception MyCustomExceptionExample of int * string with override x.Message = let (MyCustomExceptionExample(i, s)) = upcast x sprintf "Int: %d Str: %s" i s raise (MyCustomExceptionExample(20, "MyCustomErrorMessage!")) Now, we will get the following error message: In the previous screenshot, we can see our custom message with integer and string values included in the output. We can also use the helper function, failwith, to raise a failure exception, as it includes our message as an error message, as follows: failwith "An error has occurred" The preceding error message can be seen in the following screenshot: Here is a detailed exception screenshot: An example of the invalidarg helper function follows. In this factorial function, we are checking that the value of x is greater than zero. 
For cases where x is less than 0, we call invalidarg, pass x as the parameter name that is invalid, and then some error message saying the value should be greater than 0. The invalidarg helper function throws an invalid argument exception from the standard system namespace in .NET: let rec factorial x = if x < 0 then invalidArg "x" "Value should be greater than zero" match x with | 0 -> 1 | _ -> x * (factorial (x - 1)) By now, you should be pretty familiar with the F# programming language, to use in your application development, alongside C#. If you found this tutorial helpful and you're interested in learning more, head over to this book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava. .NET Core completes move to the new compiler – RyuJIT Applying Single Responsibility principle from SOLID in .NET Core Unit Testing in .NET Core with Visual Studio 2017 for better code quality
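As a quick recap of several of the pieces covered in the tutorial above — a class with a primary constructor, a record type, a list processed with higher-order functions, and pattern matching — here is a small, self-contained F# sketch. It is not code from the book; the names (Course, describe, the sample data) are illustrative assumptions, and it can be run as an .fsx script or as Program.fs in a .NET Core console project.

    // A class with a primary constructor, as covered above.
    type StudentName(firstName: string, lastName: string) =
        member this.FirstName = firstName
        member this.LastName = lastName
        member this.FullName = sprintf "%s %s" firstName lastName

    // A record type with immutable members.
    type Course = { Title: string; Hours: int }

    let student = StudentName("Ada", "Lovelace")

    // A list of records, processed with higher-order functions.
    let courses =
        [ { Title = "F# Basics"; Hours = 4 }
          { Title = ".NET Core"; Hours = 6 }
          { Title = "ASP.NET Core"; Hours = 8 } ]

    let totalHours =
        courses
        |> List.map (fun c -> c.Hours)
        |> List.sum

    // Pattern matching on a tuple, with a guard.
    let describe (name: string, hours: int) =
        match hours with
        | 0 -> sprintf "%s has no course hours" name
        | h when h > 10 -> sprintf "%s has a heavy load of %d hours" name h
        | h -> sprintf "%s has %d course hours" name h

    printfn "%s" (describe (student.FullName, totalHours))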


Multithreading in Rust using Crates [Tutorial]

Aaron Lazar
15 Aug 2018
17 min read
The crates.io ecosystem in Rust can make use of approaches to improve our development speed as well as the performance of our code. In this tutorial, we'll learn how to use the crates ecosystem to manipulate threads in Rust. This article is an extract from Rust High Performance, authored by Iban Eguia Moraza. Using non-blocking data structures One of the issues we saw earlier was that if we wanted to share something more complex than an integer or a Boolean between threads and if we wanted to mutate it, we needed to use a Mutex. This is not entirely true, since one crate, Crossbeam, allows us to use great data structures that do not require locking a Mutex. They are therefore much faster and more efficient. Often, when we want to share information between threads, it's usually a list of tasks that we want to work on cooperatively. Other times, we want to create information in multiple threads and add it to a list of information. It's therefore not so usual for multiple threads to be working with exactly the same variables since as we have seen, that requires synchronization and it will be slow. This is where Crossbeam shows all its potential. Crossbeam gives us some multithreaded queues and stacks, where we can insert data and consume data from different threads. We can, in fact, have some threads doing an initial processing of the data and others performing a second phase of the processing. Let's see how we can use these features. First, add crossbeam to the dependencies of the crate in the Cargo.toml file. Then, we start with a simple example: extern crate crossbeam; use std::thread; use std::sync::Arc; use crossbeam::sync::MsQueue; fn main() { let queue = Arc::new(MsQueue::new()); let handles: Vec<_> = (1..6) .map(|_| { let t_queue = queue.clone(); thread::spawn(move || { for _ in 0..1_000_000 { t_queue.push(10); } }) }) .collect(); for handle in handles { handle.join().unwrap(); } let final_queue = Arc::try_unwrap(queue).unwrap(); let mut sum = 0; while let Some(i) = final_queue.try_pop() { sum += i; } println!("Final sum: {}", sum); } Let's first understand what this example does. It will iterate 1,000,000 times in 5 different threads, and each time it will push a 10 to a queue. Queues are FIFO lists, first input, first output. This means that the first number entered will be the first one to pop() and the last one will be the last to do so. In this case, all of them are a 10, so it doesn't matter. Once the threads finish populating the queue, we iterate over it and we add all the numbers. A simple computation should make you able to guess that if everything goes perfectly, the final number should be 50,000,000. If you run it, that will be the result, and that's not all. If you run it by executing cargo run --release, it will run blazingly fast. On my computer, it took about one second to complete. If you want, try to implement this code with the standard library Mutex and vector, and you will see that the performance difference is amazing. As you can see, we still needed to use an Arc to control the multiple references to the queue. This is needed because the queue itself cannot be duplicated and shared, it has no reference count. Crossbeam not only gives us FIFO queues. We also have LIFO stacks. LIFO comes from last input, first output, and it means that the last element you inserted in the stack will be the first one to pop(). 
Let's see the difference with a couple of threads: extern crate crossbeam; use std::thread; use std::sync::Arc; use std::time::Duration; use crossbeam::sync::{MsQueue, TreiberStack}; fn main() { let queue = Arc::new(MsQueue::new()); let stack = Arc::new(TreiberStack::new()); let in_queue = queue.clone(); let in_stack = stack.clone(); let in_handle = thread::spawn(move || { for i in 0..5 { in_queue.push(i); in_stack.push(i); println!("Pushed :D"); thread::sleep(Duration::from_millis(50)); } }); let mut final_queue = Vec::new(); let mut final_stack = Vec::new(); let mut last_q_failed = 0; let mut last_s_failed = 0; loop { // Get the queue match queue.try_pop() { Some(i) => { final_queue.push(i); last_q_failed = 0; println!("Something in the queue! :)"); } None => { println!("Nothing in the queue :("); last_q_failed += 1; } } // Get the stack match stack.try_pop() { Some(i) => { final_stack.push(i); last_s_failed = 0; println!("Something in the stack! :)"); } None => { println!("Nothing in the stack :("); last_s_failed += 1; } } // Check if we finished if last_q_failed > 1 && last_s_failed > 1 { break; } else if last_q_failed > 0 || last_s_failed > 0 { thread::sleep(Duration::from_millis(100)); } } in_handle.join().unwrap(); println!("Queue: {:?}", final_queue); println!("Stack: {:?}", final_stack); } As you can see in the code, we have two shared variables: a queue and a stack. The secondary thread will push new values to each of them, in the same order, from 0 to 4. Then, the main thread will try to get them back. It will loop indefinitely and use the try_pop() method. The pop() method can be used, but it will block the thread if the queue or the stack is empty. This will happen in any case once all values get popped since no new values are being added, so the try_pop() method will help not to block the main thread and end gracefully. The way it checks whether all the values were popped is by counting how many times it failed to pop a new value. Every time it fails, it will wait for 100 milliseconds, while the push thread only waits for 50 milliseconds between pushes. This means that if it tries to pop new values two times and there are no new values, the pusher thread has already finished. It will add values as they are popped to two vectors and then print the result. In the meantime, it will print messages about pushing and popping new values. You will understand this better by seeing the output: Note that the output can be different in your case, since threads don't need to be executed in any particular order. In this example output, as you can see, it first tries to get something from the queue and the stack but there is nothing there, so it sleeps. The second thread then starts pushing things, two numbers actually. After this, the queue and the stack will be [0, 1]. Then, it pops the first item from each of them. From the queue, it will pop the 0 and from the stack it will pop the 1 (the last one), leaving the queue as [1] and the stack as [0]. It will go back to sleep and the secondary thread will insert a 2 in each variable, leaving the queue as [1, 2] and the stack as [0, 2]. Then, the main thread will pop two elements from each of them. From the queue, it will pop the 1 and the 2, while from the stack it will pop the 2 and then the 0, leaving both empty. The main thread then goes to sleep, and for the next two tries, the secondary thread will push one element and the main thread will pop it, twice. 
It might seem a little bit complex, but the idea is that these queues and stacks can be used efficiently between threads without requiring a Mutex, and they accept any Send type. This means that they are great for complex computations, and even for multi-staged complex computations. The Crossbeam crate also has some helpers to deal with epochs and even some variants of the mentioned types. For multithreading, Crossbeam also adds a great utility: scoped threads. Scoped threads In all our examples, we have used standard library threads. As we have discussed, these threads have their own stack, so if we want to use variables that we created in the main thread we will need to send them to the thread. This means that we will need to use things such as Arc to share non-mutable data. Not only that, having their own stack means that they will also consume more memory and eventually make the system slower if they use too much. Crossbeam gives us some special threads that allow sharing stacks between them. They are called scoped threads. Using them is pretty simple and the crate documentation explains them perfectly; you will just need to create a Scope by calling crossbeam::scope(). You will need to pass a closure that receives the Scope. You can then call spawn() in that scope the same way you would do it in std::thread, but with one difference, you can share immutable variables among threads if they were created inside the scope or moved to it. This means that for the queues or stacks we just talked about, or for atomic data, you can simply call their methods without requiring an Arc! This will improve the performance even further. Let's see how it works with a simple example: extern crate crossbeam; fn main() { let all_nums: Vec<_> = (0..1_000_u64).into_iter().collect(); let mut results = Vec::new(); crossbeam::scope(|scope| { for num in &all_nums { results.push(scope.spawn(move || num * num + num * 5 + 250)); } }); let final_result: u64 = results.into_iter().map(|res| res.join()).sum(); println!("Final result: {}", final_result); } Let's see what this code does. It will first just create a vector with all the numbers from 0 to 1000. Then, for each of them, in a crossbeam scope, it will run one scoped thread per number and perform a supposedly complex computation. This is just an example, since it will just return a result of a simple second-order function. Interestingly enough, though, the scope.spawn() method allows returning a result of any type, which is great in our case. The code will add each result to a vector. This won't directly add the resulting number, since it will be executed in parallel. It will add a result guard, which we will be able to check outside the scope. Then, after all the threads run and return the results, the scope will end. We can now check all the results, which are guaranteed to be ready for us. For each of them, we just need to call join() and we will get the result. Then, we sum it up to check that they are actual results from the computation. This join() method can also be called inside the scope and get the results, but it will mean that if you do it inside the for loop, for example, you will block the loop until the result is generated, which is not efficient. The best thing is to at least run all the computations first and then start checking the results. If you want to perform more computations after them, you might find it useful to run the new computation in another loop or iterator inside the crossbeam scope. 
But, how does crossbeam allow you to use the variables outside the scope freely? Won't there be data races? Here is where the magic happens. The scope will join all the inner threads before exiting, which means that no further code will be executed in the main thread until all the scoped threads finish. This means that we can use the variables of the main thread, also called parent stack, due to the main thread being the parent of the scope in this case without any issue. We can actually check what is happening by using the println!() macro. If we remember from previous examples, printing to the console after spawning some threads would usually run even before the spawned threads, due to the time it takes to set them up. In this case, since we have crossbeam preventing it, we won't see it. Let's check the example: extern crate crossbeam; fn main() { let all_nums: Vec<_> = (0..10).into_iter().collect(); crossbeam::scope(|scope| { for num in all_nums { scope.spawn(move || { println!("Next number is {}", num); }); } }); println!("Main thread continues :)"); } If you run this code, you will see something similar to the following output: As you can see, scoped threads will run without any particular order. In this case, it will first run the 1, then the 0, then the 2, and so on. Your output will probably be different. The interesting thing, though, is that the main thread won't continue executing until all the threads have finished. Therefore, reading and modifying variables in the main thread is perfectly safe. There are two main performance advantages with this approach; Arc will require a call to malloc() to allocate memory in the heap, which will take time if it's a big structure and the memory is a bit full. Interestingly enough, that data is already in our stack, so if possible, we should try to avoid duplicating it in the heap. Moreover, the Arc will have a reference counter, as we saw. And it will even be an atomic reference counter, which means that every time we clone the reference, we will need to atomically increment the count. This takes time, even more than incrementing simple integers. Most of the time, we might be waiting for some expensive computations to run, and it would be great if they just gave all the results when finished. We can still add some more chained computations, using scoped threads, that will only be executed after the first ones finish, so we should use scoped threads more often than normal threads, if possible. Using thread pool So far, we have seen multiple ways of creating new threads and sharing information between them. Nevertheless, the ideal number of threads we should spawn to do all the work should be around the number of virtual processors in the system. This means we should not spawn one thread for each chunk of work. Nevertheless, controlling what work each thread does can be complex, since you have to make sure that all threads have work to do at any given point in time. Here is where thread pooling comes in handy. The Threadpool crate will enable you to iterate over all your work and for each of your small chunks, you can call something similar to a thread::spawn(). The interesting thing is that each task will be assigned to an idle thread, and no new thread will be created for each task. The number of threads is configurable and you can get the number of CPUs with other crates. Not only that, if one of the threads panics, it will automatically add a new one to the pool. 
To see an example, first, let's add threadpool and num_cpus as dependencies in our Cargo.toml file.  Then, let's see an example code: extern crate num_cpus; extern crate threadpool; use std::sync::atomic::{AtomicUsize, Ordering}; use std::sync::Arc; use threadpool::ThreadPool; fn main() { let pool = ThreadPool::with_name("my worker".to_owned(), num_cpus::get()); println!("Pool threads: {}", pool.max_count()); let result = Arc::new(AtomicUsize::new(0)); for i in 0..1_0000_000 { let t_result = result.clone(); pool.execute(move || { t_result.fetch_add(i, Ordering::Relaxed); }); } pool.join(); let final_res = Arc::try_unwrap(result).unwrap().into_inner(); println!("Final result: {}", final_res); } This code will create a thread pool of threads with the number of logical CPUs in your computer. Then, it will add a number from 0 to 1,000,000 to an atomic usize, just to test parallel processing. Each addition will be performed by one thread. Doing this with one thread per operation (1,000,000 threads) would be really inefficient. In this case, though, it will use the appropriate number of threads, and the execution will be really fast. There is another crate that gives thread pools an even more interesting parallel processing feature: Rayon. Using parallel iterators If you can see the big picture in these code examples, you'll have realized that most of the parallel work has a long loop, giving work to different threads. It happened with simple threads and it happens even more with scoped threads and thread pools. It's usually the case in real life, too. You might have a bunch of data to process, and you can probably separate that processing into chunks, iterate over them, and hand them over to various threads to do the work for you. The main issue with that approach is that if you need to use multiple stages to process a given piece of data, you might end up with lots of boilerplate code that can make it difficult to maintain. Not only that, you might find yourself not using parallel processing sometimes due to the hassle of having to write all that code. Luckily, Rayon has multiple data parallelism primitives around iterators that you can use to parallelize any iterative computation. You can almost forget about the Iterator trait and use Rayon's ParallelIterator alternative, which is as easy to use as the standard library trait! Rayon uses a parallel iteration technique called work stealing. For each iteration of the parallel iterator, the new value or values get added to a queue of pending work. Then, when a thread finishes its work, it checks whether there is any pending work to do and if there is, it starts processing it. This, in most languages, is a clear source of data races, but thanks to Rust, this is no longer an issue, and your algorithms can run extremely fast and in parallel. Let's look at how to use it for an example similar to those we have seen in this chapter. First, add rayon to your Cargo.toml file and then let's start with the code: extern crate rayon; use rayon::prelude::*; fn main() { let result = (0..1_000_000_u64) .into_par_iter() .map(|e| e * 2) .sum::<u64>(); println!("Result: {}", result); } As you can see, this works just as you would write it in a sequential iterator, yet, it's running in parallel. 
Of course, running this example sequentially will be faster than running it in parallel thanks to compiler optimizations, but when you need to process data from files, for example, or perform very complex mathematical computations, parallelizing the input can give great performance gains. Rayon implements these parallel iteration traits to all standard library iterators and ranges. Not only that, it can also work with standard library collections, such as HashMap and Vec. In most cases, if you are using the iter() or into_iter() methods from the standard library in your code, you can simply use par_iter() or into_par_iter() in those calls and your code should now be parallel and work perfectly. But, beware, sometimes parallelizing something doesn't automatically improve its performance. Take into account that if you need to update some shared information between the threads, they will need to synchronize somehow, and you will lose performance. Therefore, multithreading is only great if workloads are completely independent and you can execute one without any dependency on the rest. If you found this article useful and would like to learn more such tips, head over to pick up this book, Rust High Performance, authored by Iban Eguia Moraza. Rust 1.28 is here with global allocators, nonZero types and more Java Multithreading: How to synchronize threads to implement critical sections and avoid race conditions Multithreading with Qt
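Circling back to the earlier suggestion to reimplement the first queue example with the standard library: the sketch below (an illustration, not code from the book) performs the same 5 x 1,000,000 pushes, but into a Mutex-protected Vec instead of Crossbeam's MsQueue. Comparing its cargo run --release time with the MsQueue version above shows the cost of lock contention.

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        let data = Arc::new(Mutex::new(Vec::new()));

        let handles: Vec<_> = (1..6)
            .map(|_| {
                let t_data = Arc::clone(&data);
                thread::spawn(move || {
                    for _ in 0..1_000_000 {
                        // Every push has to acquire the lock, so the five
                        // threads constantly contend with each other.
                        t_data.lock().unwrap().push(10_u64);
                    }
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }

        let final_data = Arc::try_unwrap(data).unwrap().into_inner().unwrap();
        let sum: u64 = final_data.iter().sum();
        println!("Final sum: {}", sum);
    }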


Understanding functional reactive programming in Scala [Tutorial]

Fatema Patrawala
15 Aug 2018
6 min read
Like OOP (Object-Oriented Programming), Functional Programming (FP) is a programming paradigm. It is a style in which we write programs in terms of pure functions and immutable data, and treat a program as the evaluation of functions. Because we use pure functions and immutable data, we get many benefits for free; for instance, with immutable data we do not need to worry about shared mutable state, side effects, or thread safety. FP follows a declarative style, which means programs are written in terms of expressions rather than statements: where OOP or imperative programming relies on statements, FP treats everything as an expression. In this Scala functional programming tutorial, we will understand the principles and benefits of FP and why Functional Reactive Programming is the best fit for Reactive Programming in Scala. This tutorial is an extract from the book Scala Reactive Programming, written by Rambabu Posa.

Principles of functional programming

FP has the following principles:

- Pure functions
- Immutable data
- No side effects
- Referential transparency (RT)
- Functions as first-class citizens
- Functions of all kinds: anonymous functions, higher-order functions, combinators, partial functions, partially applied functions, function currying, and closures
- Tail recursion
- Function composability

A pure function always returns the same result for the same inputs, irrespective of how many times and where it is run. Immutable data gives us many benefits, for instance no shared data, no side effects, and thread safety for free. Just as an object is a first-class citizen in OOP, a function is a first-class citizen in FP. This means we can use a function as any of these:

- An object
- A value
- A piece of data
- A data type
- An operation

In simple words, in FP we treat functions and data the same way. We can compose functions in sequence so that even complex problems can be solved easily. Higher-Order Functions (HOFs) are functions that take one or more functions as parameters, return a function as their result, or both. For instance, map(), flatMap(), and filter() are some of the most important and frequently used higher-order functions. Consider the following example:

    map(x => x * x)

Here, the map() function is an example of a higher-order function because it takes an anonymous function as its parameter. The anonymous function x => x * x is of type Int => Int; it takes an Int as input and returns an Int as its result. An anonymous function is simply a function without a name.

Benefits of functional programming

FP provides us with many benefits:

- Thread-safe code
- Easy-to-write concurrent and parallel code
- Simple, readable, and elegant code
- Type safety
- Composability
- Support for declarative programming

Because we use pure functions and immutability in FP, we get thread safety for free. One of the greatest benefits of FP is function composability: we can compose multiple functions one after another and execute them either sequentially or in parallel, which gives us a great approach for solving complex problems easily.
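The following short Scala sketch is not from the book; it is just an illustration of the ideas above, and the names (square, applyTwice, squareThenDouble) are made up for the example. It shows a pure function, a higher-order function, function composition with andThen, and standard-library combinators on an immutable list.

    object FpBasics extends App {
      // A pure function: same output for the same input, no side effects.
      def square(x: Int): Int = x * x

      // A higher-order function: takes a function as a parameter.
      def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

      // Function composition: build a new function from two existing ones.
      val double: Int => Int = _ * 2
      val squareThenDouble: Int => Int = (square _).andThen(double)

      // Immutable data plus HOFs/combinators from the standard library.
      val numbers = List(1, 2, 3, 4, 5)
      val result = numbers.map(square).filter(_ % 2 == 1)

      println(applyTwice(double, 3)) // 12
      println(squareThenDouble(4))   // 32
      println(result)                // List(1, 9, 25)
    }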
Functional Reactive programming

The combination of FP and RP is known as Functional Reactive Programming or, for short, FRP. It is multiparadigm and combines the benefits and best features of two of the most popular programming paradigms: FP and RP. FRP is a new style of programming that uses the RP paradigm to support asynchronous, non-blocking data streaming with backpressure, and uses the FP paradigm to take advantage of its features (such as pure functions, immutability, no side effects, and RT) and its HOFs or combinators (such as map, flatMap, filter, reduce, fold, and zip). In simple words, FRP is a programming paradigm that supports RP using FP features and building blocks: FRP = FP + RP.

Today, we have many FRP solutions, frameworks, tools, and technologies. Here is a list of a few of them:

- Scala, Play Framework, and Akka Toolkit
- RxJS
- Reactive-banana
- Reactive
- Sodium
- Haskell

This book is dedicated to Lightbend's FRP technology stack—Lagom Framework, Scala, Play Framework, and Akka Toolkit (Akka Streams). FRP technologies are mainly useful for developing interactive programs, such as rich GUIs (graphical user interfaces), animations, multiplayer games, computer music, or robot controllers.

Types of Reactive Programming

Even though most projects and companies use the FP paradigm to develop their Reactive systems and solutions, there are a couple of ways to use RP. These are known as the types of RP:

- FRP (Functional Reactive Programming)
- OORP (Object-Oriented Reactive Programming)

However, FP is the best programming paradigm to combine with RP, because we get all the benefits of FP for free.

Why FP is the best fit for RP

When we combine RP with FP, we get the following benefits:

- Composability—we can compose multiple data streams using functional operations, so even complex problems can be solved easily
- Thread safety
- Readability
- Simple, concise, clear, and easy-to-understand code
- Easy-to-write asynchronous, concurrent, and parallel code
- Very flexible and easy-to-use operations
- Support for declarative programming
- Code that is easy to write, more scalable, highly available, and robust

In FP, we concentrate on what to do to fulfill a job, whereas in other programming paradigms, such as OOP or imperative programming (IP), we concentrate on how to do it. Declarative programming gives us the following benefits:

- No side effects
- Enforced immutability
- Concise and understandable code that is easy to write

The main property of RP is real-time data streaming, and the main property of FP is composability. If we combine these two paradigms, we get more benefits and can develop better solutions easily. In RP everything is a stream, while in FP everything is a function, and we can use those functions to perform operations on data streams.

We learnt the principles and benefits of Scala functional programming. To build fault-tolerant, robust, and distributed applications in Scala, grab the book Scala Reactive Programming today.

Introduction to the Functional Programming
Manipulating functions in functional programming
Why functional programming in Python matters: Interview with best selling author, Steven Lott
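To make the FRP idea concrete, here is a minimal Akka Streams sketch. It is an illustration rather than code from the book, and it assumes Akka 2.5.x with the akka-stream dependency on the classpath; the object and system names are arbitrary. A Source emits elements, functional combinators such as map and filter transform the stream, and a Sink consumes it, with backpressure handled by the library.

    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Sink, Source}

    object FrpSketch extends App {
      implicit val system: ActorSystem = ActorSystem("frp-sketch")
      implicit val materializer: ActorMaterializer = ActorMaterializer()
      import system.dispatcher

      // Everything is a stream: emit 1..10, transform it with pure functions,
      // and consume the results; backpressure is managed by Akka Streams.
      Source(1 to 10)
        .map(n => n * n)
        .filter(_ % 2 == 0)
        .runWith(Sink.foreach(n => println(s"Received: $n")))
        .onComplete(_ => system.terminate())
    }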


MongoDB Sharding: Sharding clusters and choosing the right shard key [Tutorial]

Fatema Patrawala
14 Aug 2018
9 min read
Sharding was one of the features that MongoDB offered from an early stage, since version 1.6 was released in August 2010. Sharding is the ability to horizontally scale out our database by partitioning our datasets across different servers—the shards. Foursquare and Bitly are two of the most famous early customers for MongoDB that were also using sharding from its inception all the way to the general availability release. In this article we will learn how to design a sharding cluster and how to make the single most important decision around it of choosing the unique shard key. This article is a MongoDB shard tutorial taken from the book Mastering MongoDB 3.x by Alex Giamas. Sharding setup in MongoDB Sharding is performed at the collection level. We can have collections that we don't want or need to shard for several reasons. We can leave these collections unsharded. These collections will be stored in the primary shard. The primary shard is different for each database in MongoDB. The primary shard is automatically selected by MongoDB when we create a new database in a sharded environment. MongoDB will pick the shard that has the least data stored at the moment of creation. If we want to change the primary shard at any other point, we can issue the following command: > db.runCommand( { movePrimary : "mongo_books", to : "UK_based" } ) We thus move the database named mongo_books to the shard named UK_based. Choosing the shard key Choosing our shard key is the most important decision we need to make. The reason is that once we shard our data and deploy our cluster, it becomes very difficult to change the shard key. First, we will go through the process of changing the shard key. Changing the shard key There is no command or simple procedure to change the shard key in MongoDB. The only way to change the shard key involves backing up and restoring all of our data, something that may range from being extremely difficult to impossible in high-load production environments. The steps if we want to change our shard key are as follows: Export all data from MongoDB. Drop the original sharded collection. Configure sharding with the new key. Presplit the new shard key range. Restore our data back into MongoDB. From these steps, step 4 is the one that needs some more explanation. MongoDB uses chunks to split data in a sharded collection. If we bootstrap a MongoDB sharded cluster from scratch, chunks will be calculated automatically by MongoDB. MongoDB will then distribute the chunks across different shards to ensure that there are an equal number of chunks in each shard. The only case in which we cannot really do this is when we want to load data into a newly sharded collection. The reasons are threefold: MongoDB creates splits only after an insert operation. Chunk migration will copy all of the data in that chunk from one shard to another. The floor(n/2) chunk migrations can happen at any given time, where n is the number of shards we have. Even with three shards, this is only a floor(1.5)=1 chunk migration at a time. These three limitations combined mean that letting MongoDB to figure it out on its own will definitely take much longer and may result in an eventual failure. This is why we want to presplit our data and give MongoDB some guidance on where our chunks should go. 
Considering our example of the mongo_books database and the books collection, this would be: > db.runCommand( { split : "mongo_books.books", middle : { id : 50 } } ) The middle command parameter will split our key space in documents that have id<=50 and documents that have id>50. There is no need for a document to exist in our collection with id=50 as this will only serve as the guidance value for our partitions. In this example, we chose 50 assuming that our keys follow a uniform distribution (that is, the same count of keys for each value) in the range of values from 0 to 100. We should aim to create at least 20-30 chunks to grant MongoDB flexibility in potential migrations. We can also use bounds and find instead of middle if we want to manually define the partition key, but both parameters need data to exist in our collection before applying them. Choosing the correct shard key After the previous section, it's now self-evident that we need to take into great consideration the choice of our shard key as it is something that we have to stick with. A great shard key has three characteristics: High cardinality Low frequency Non-monotonically changing in value We will go over the definitions of these three properties first to understand what they mean. High cardinality means that the shard key must have as many distinct values as possible. A Boolean can take only values of true/false, and so it is a bad shard key choice. A 64-bit long value field that can take any value from −(2^63) to 2^63 − 1 and is a good example in terms of cardinality. Low frequency directly relates to the argument about high cardinality. A low-frequency shard key will have a distribution of values as close to a perfectly random / uniform distribution. Using the example of our 64-bit long value, it is of little use to us if we have a field that can take values ranging from −(2^63) to 2^63 − 1 only to end up observing the values of 0 and 1 all the time. In fact, it is as bad as using a Boolean field, which can also take only two values after all. If we have a shard key with high frequency values, we will end up with chunks that are indivisible. These chunks cannot be further divided and will grow in size, negatively affecting the performance of the shard that contains them. Non-monotonically changing values mean that our shard key should not be, for example, an integer that always increases with every new insert. If we choose a monotonically increasing value as our shard key, this will result in all writes ending up in the last of all of our shards, limiting our write performance. If we want to use a monotonically changing value as the shard key, we should consider using hash-based sharding. In the next section, we will describe different sharding strategies and their advantages and disadvantages. Range-based sharding The default and the most widely used sharding strategy is range-based sharding. This strategy will split our collection's data into chunks, grouping documents with nearby values in the same shard. For our example database and collection, mongo_books and books respectively, we have: > sh.shardCollection("mongo_books.books", { id: 1 } ) This creates a range-based shard key on id with ascending direction. The direction of our shard key will determine which documents will end up in the first shard and which ones in the subsequent ones. This is a good strategy if we plan to have range-based queries as these will be directed to the shard that holds the result set instead of having to query all shards. 
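Building on the single split shown above, a common way to presplit is to loop over the key range from the mongos shell and create the 20-30 chunks mentioned earlier in one pass. The following sketch uses illustrative values (it assumes mongo_books.books has already been sharded on { id: 1 } and that id is roughly uniformly distributed between 0 and 1000); the boundary spacing is an assumption, not a recommendation from the book.

    // Run against mongos; creates chunk boundaries at id = 40, 80, ..., 960.
    for (var boundary = 40; boundary < 1000; boundary += 40) {
        sh.splitAt("mongo_books.books", { id: boundary });
    }
    sh.status();  // verify the chunk distribution across shards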
Hash-based sharding If we don't have a shard key (or can't create one) that achieves the three goals mentioned previously, we can use the alternative strategy of using hash-based sharding. In this case, we are trading data distribution with query isolation. Hash-based sharding will take the values of our shard key and hash them in a way that guarantees close to uniform distribution. This way we can be sure that our data will evenly distribute across shards. The downside is that only exact match queries will get routed to the exact shard that holds the value. Any range query will have to go out and fetch data from all shards. For our example database and collection (mongo_books and books respectively), we have: > sh.shardCollection("mongo_books.books", { id: "hashed" } ) Similar to the preceding example, we are now using the id field as our hashed shard key. Suppose we use fields with float values for hash-based sharding. Then we will end up with collisions if the precision of our floats is more that 2^53. These fields should be avoided where possible. Coming up with our own key Range-based sharding does not need to be confined to a single key. In fact, in most cases, we would like to combine multiple keys to achieve high cardinality and low frequency. A common pattern is to combine a low-cardinality first part (but still having as distinct values more than two times the number of shards that we have) with a high-cardinality key as its second field. This achieves both read and write distribution from the first part of the sharding key and then cardinality and read locality from the second part. On the other hand, if we don't have range queries, we can get away by using hash-based sharding on a primary key as this will exactly target the shard and document that we are going after. To make things more complicated, these considerations may change depending on our workload. A workload that consists almost exclusively (say 99.5%) of reads won't care about write distribution. We can use the built-in _id field as our shard key and this will only add 0.5% load in the last shard. Our reads will still be distributed across shards. Unfortunately, in most cases, this is not simple. Location-based data Due to government regulations and the desire to have our data as close to our users as possible, there is often a constraint and need to limit data in a specific data center. By placing different shards at different data centers, we can satisfy this requirement. To summarize we learned about MongoDB sharding and got to know techniques to choose the correct shard key. Get the expert guide Mastering MongoDB 3.x  today to build fault-tolerant MongoDB application. MongoDB 4.0 now generally available with support for multi-platform, mobile, ACID transactions and more MongoDB going relational with 4.0 release Indexing, Replicating, and Sharding in MongoDB [Tutorial]
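To illustrate the "coming up with our own key" pattern described above, here is a short sketch with illustrative collection and field names (reviews, ratings, country, user_id are not from the book). It pairs a lower-cardinality first field with a high-cardinality second field in a compound range-based key, and shows the hashed alternative for comparison.

    // Enable sharding for the database first.
    sh.enableSharding("mongo_books")

    // Compound range-based key: country has modest cardinality (but still
    // several times the number of shards), user_id adds high cardinality.
    sh.shardCollection("mongo_books.reviews", { country: 1, user_id: 1 })

    // Hashed key on _id: even write distribution, but range queries
    // on _id will have to fetch from every shard.
    sh.shardCollection("mongo_books.ratings", { _id: "hashed" })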

Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 2
Guest Contributor
14 Aug 2018
8 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming and Modern Golang Programming video courses. We published Part 1 of this paper yesterday; Part 2 covers containers and serverless applications. Let us get started.

Containers

The technology of software containers is the next key technology that needs to be discussed to practically explain cloud native applications. A container is simply the idea of encapsulating some software inside an isolated user space, or "container." For example, a MySQL database can be isolated inside a container, where the environmental variables and the configurations that it needs will live. Software outside the container will not see the environmental variables or configuration contained inside the container by default. Multiple containers can exist on the same local virtual machine, cloud virtual machine, or hardware server.

Containers provide the ability to run numerous isolated software services, with all their configurations, software dependencies, runtimes, tools, and accompanying files, on the same machine. In a cloud environment, this ability translates into saved costs and effort, as the need for provisioning and buying server nodes for each microservice diminishes, since different microservices can be deployed on the same host without disrupting each other. Containers combined with microservices architectures are powerful tools to build modern, portable, scalable, and cost-efficient software. In a production environment, more than a single server node, combined with numerous containers, would be needed to achieve scalability and redundancy.

Containers also add more benefits to cloud native applications beyond microservices isolation. With a container, you can move your microservices, with all the configuration, dependencies, and environmental variables that they need, to fresh server nodes without the need to reconfigure the environment, achieving powerful portability. Due to the power and popularity of software container technology, some new operating systems, like CoreOS or Photon OS, are built from the ground up to function as hosts for containers.

One of the most popular software container projects in the software industry is Docker. Major organizations such as Cisco, Google, and IBM utilize Docker containers in their infrastructure as well as in their products. Another notable project in the software containers world is Kubernetes. Kubernetes is a tool that allows the automation of deployment, management, and scaling of containers. It was built by Google to facilitate the management of their containers, which number in the billions per week. Kubernetes provides some powerful features such as load balancing between containers, restarting failed containers, and orchestration of the storage utilized by the containers. The project is part of the Cloud Native Computing Foundation, along with Prometheus.
Container complexities

In the case of containers, the task of managing them can sometimes get rather complex, for the same reasons as managing expanding numbers of microservices. As containers or microservices grow in number, there needs to be a mechanism to identify where each container or microservice is deployed, what its purpose is, and what resources it needs to keep running.

Serverless applications

Serverless architecture is a new software architectural paradigm that was popularized with the AWS Lambda service. In order to fully understand serverless applications, we must first cover an important concept known as 'Function as a Service', or FaaS for short. Function as a Service, or FaaS, is the idea that a cloud provider such as Amazon, or even a local piece of software such as Fission.io or funktion, provides a service where a user can request a function to run remotely in order to perform a very specific task; after the function concludes, the results are returned to the user. No services or stateful data are maintained, and the function code is provided by the user to the service that runs the function.

The idea behind properly designed cloud native production applications that utilize the serverless architecture is that, instead of building multiple microservices expected to run continuously in order to carry out individual tasks, we build an application that has fewer microservices combined with FaaS, where FaaS covers tasks that don't need services running continuously. FaaS is a smaller construct than a microservice. For example, in the case of the event booking application we covered earlier, there were multiple microservices covering different tasks. If we use a serverless application model, some of those microservices would be replaced with a number of functions that serve their purpose. Here is a diagram that showcases the application utilizing a serverless architecture:

In this diagram, the event handler microservice as well as the booking handler microservice were replaced with a number of functions that produce the same functionality. This eliminates the need to run and maintain the two existing microservices. Serverless architectures have the advantage that no virtual machines and/or containers need to be provisioned to build the part of the application that utilizes FaaS. The computing instances that run the functions cease to exist, from the user's point of view, once their functions conclude. Furthermore, the number of microservices and/or containers that need to be monitored and maintained by the user decreases, saving cost, time, and effort. Serverless architectures provide yet another powerful software building tool in the hands of software engineers and architects to design flexible and scalable software. Well-known FaaS offerings are AWS Lambda by Amazon, Azure Functions by Microsoft, Cloud Functions by Google, and many more.

Another definition of serverless applications is applications that utilize the BaaS, or backend as a service, paradigm. BaaS is the idea that developers only write the client code of their application, which then relies on several pre-built software services hosted in the cloud and accessible via APIs. BaaS is popular in mobile app programming, where developers rely on a number of backend services to drive the majority of the functionality of the application. Examples of BaaS services are Firebase and Parse.
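To make the FaaS side of this a little more concrete, here is a minimal sketch of what one of those booking functions might look like as an AWS Lambda handler written in C#. The class names, the request/response shapes, and the omitted persistence logic are illustrative assumptions, not part of the whitepaper; the sketch assumes the Amazon.Lambda.Core and Amazon.Lambda.Serialization.Json NuGet packages:

using Amazon.Lambda.Core;

[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]

namespace EventBooking.Functions
{
    public class BookingRequest
    {
        public string EventId { get; set; }
        public string UserId { get; set; }
        public int Seats { get; set; }
    }

    public class BookingResult
    {
        public bool Confirmed { get; set; }
        public string Message { get; set; }
    }

    public class BookEventFunction
    {
        // Invoked by the FaaS platform; no server is kept running between calls.
        public BookingResult FunctionHandler(BookingRequest request, ILambdaContext context)
        {
            context.Logger.LogLine($"Booking {request.Seats} seat(s) for event {request.EventId}");

            // Seat validation and persistence would go here. Because the function
            // keeps no state between invocations, any state must be written to an
            // external store (a database, a queue, and so on).
            return new BookingResult { Confirmed = true, Message = "Booking accepted" };
        }
    }
}

Such a function would typically be wired to an API Gateway endpoint or another event source, rather than hosted inside one of the application's long-running services.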
Disadvantages of serverless applications

Similarly to microservices and cloud native applications, the serverless architecture is not suitable for every scenario. The functions provided by FaaS don't keep state by themselves, which means special considerations need to be observed when writing the function code. This is unlike a full microservice, where the developer has full control over the state. One approach to keeping state with FaaS, in spite of this limitation, is to propagate the state to a database or a memory cache like Redis.

The startup times for the functions are not always fast, since there is the time taken to send the request to the FaaS service provider and, in some cases, the time needed to start a computing instance that runs the function. These delays have to be accounted for when designing serverless applications. FaaS functions do not run continuously like microservices, which makes them unsuitable for any task that requires continuously running software. Serverless applications also have the same limitation as other cloud native applications, where portability of the application from one cloud provider to another, or from the cloud to a local environment, becomes challenging because of vendor lock-in.

Conclusion

Cloud computing architectures have opened avenues for developing efficient, scalable, and reliable software. This paper covered some significant concepts in the world of cloud computing, such as microservices, cloud native applications, containers, and serverless applications. Microservices are the building blocks for most scalable cloud native applications; they decouple the application tasks into various efficient services. Containers are how microservices can be isolated and deployed safely to production environments without polluting them. Serverless applications decouple application tasks into smaller constructs, mostly called functions, that can be consumed via APIs. Cloud native applications make use of all those architectural patterns to build scalable, reliable, and always available software.

You read Part 2 of Modern cloud native architectures, a white paper by Mina Andrawos. Also read Part 1, which covers microservices and cloud native applications with their advantages and disadvantages. If you are interested in learning more, check out Mina's Cloud Native programming with Golang to explore practical techniques for building cloud-native apps that are scalable, reliable, and always available.

About the author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled in the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.

Build Java EE containers using Docker [Tutorial]
Are containers the end of virtual machines?
Why containers are driving DevOps

Access application data with Entity Framework in .NET Core [Tutorial]
Aaron Lazar
14 Aug 2018
14 min read
In this tutorial, we will get started with using the Entity Framework and create a simple console application to perform CRUD operations. The intent is to get started with EF Core and understand how to use it. Before we dive into coding, let us see the two development approaches that EF Core supports:

Code-first
Database-first

These two paradigms have been supported for a very long time and therefore we will just look at them at a very high level. EF Core mainly targets the code-first approach and has limited support for the database-first approach, as there is no support for the visual designer or wizard for the database model out of the box. However, there are third-party tools and extensions that support this. The list of third-party tools and extensions can be seen at https://docs.microsoft.com/en-us/ef/core/extensions/. This tutorial has been extracted from the book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

In the code-first approach, we first write the code; that is, we first create the domain model classes and then, using these classes, EF Core APIs create the database and tables, using migration based on the convention and configuration provided. We will look at conventions and configurations a little later in this section. The following diagram illustrates the code-first approach:

In the database-first approach, as the name suggests, we have an existing database or we create a database first and then use EF Core APIs to create the domain and context classes. As mentioned, currently EF Core has limited support for it due to a lack of tooling. So, our preference will be for the code-first approach throughout our examples. The reader can discover the third-party tools mentioned previously to learn more about the EF Core database-first approach as well. The following image illustrates the database-first approach:

Building Entity Framework Core Console App

Now that we understand the approaches and know that we will be using the code-first approach, let's dive into coding our getting started with EF Core console app. Before we do so, we need to have SQL Express installed in our development machine. If SQL Express is not installed, download the SQL Express 2017 edition from https://www.microsoft.com/en-IN/sql-server/sql-server-downloads and run the setup wizard. We will do the Basic installation of SQL Express 2017 for our learning purposes, as shown in the following screenshot:

Our objective is to learn how to use EF Core and so we will not do anything fancy in our console app. We will just do simple Create Read Update Delete (CRUD) operations of a simple class called Person, as defined here:

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool Gender { get; set; }
    public DateTime DateOfBirth { get; set; }

    public int Age
    {
        get
        {
            var age = DateTime.Now.Year - this.DateOfBirth.Year;
            if (DateTime.Now.DayOfYear < this.DateOfBirth.DayOfYear)
            {
                age = age - 1;
            }

            return age;
        }
    }
}

As we can see in the preceding code, the class has simple properties. To perform the CRUD operations on this class, let's create a console app by performing the following steps:

Create a new .NET Core console project named GettingStartedWithEFCore, as shown in the following screenshot:

Create a new folder named Models in the project node and add the Person class to this newly created folder. This will be our model entity class, which we will use for CRUD operations.

Next, we need to install the EF Core package.
Before we do that, it's important to know that EF Core provides support for a variety of databases. A few of the important ones are:

SQL Server
SQLite
InMemory (for testing)

The complete and comprehensive list can be seen at https://docs.microsoft.com/en-us/ef/core/providers/. We will be working with SQL Server on Windows for our learning purposes, so let's install the SQL Server package for Entity Framework Core. To do so, let's install the Microsoft.EntityFrameworkCore.SqlServer package from the NuGet Package Manager in Visual Studio 2017. Right-click on the project, select Manage Nuget Packages, and then search for Microsoft.EntityFrameworkCore.SqlServer. Select the matching result and click Install:

Next, we will create a class called Context, as shown here:

public class Context : DbContext
{
    public DbSet<Person> Persons { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        //// Get the connection string from configuration
        optionsBuilder.UseSqlServer(@"Server=.\SQLEXPRESS;Database=PersonDatabase;Trusted_Connection=True;");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Person>().Property(nameof(Person.Name)).IsRequired();
    }
}

The class looks quite simple, but it has the following subtle and important things to make note of:

The Context class derives from DbContext, which resides in the Microsoft.EntityFrameworkCore namespace. DbContext is an integral part of EF Core and if you have worked with EF, you will already be aware of it. An instance of DbContext represents a session with the database and can be used to query and save instances of your entities. DbContext is a combination of the Unit of Work and Repository patterns. Typically, you create a class that derives from DbContext and contains Microsoft.EntityFrameworkCore.DbSet properties for each entity in the model. If the properties have a public setter, they are automatically initialized when the instance of the derived context is created.

It contains a property named Persons (plural of the model class Person) of type DbSet<Person>. This will map to the Persons table in the underlying database.

The class overrides the OnConfiguring method of DbContext and specifies the connection string to be used with the SQL Server database. The connection string should be read from the configuration file, appSettings.json, but for the sake of brevity and simplicity, it's hardcoded in the preceding code. The OnConfiguring method allows us to select and configure the data source to be used with a context using DbContextOptionsBuilder.

Let's look at the connection string. Server= specifies the server. It can be .\SQLEXPRESS, .\SQLSERVER, .\LOCALDB, or any other instance name based on the installation you have done. Database= specifies the database name that will be created. Trusted_Connection=True specifies that we are using integrated security or Windows authentication. An enthusiastic reader should read the official Microsoft Entity Framework documentation on configuring the context at https://docs.microsoft.com/en-us/ef/core/miscellaneous/configuring-dbcontext.

The OnModelCreating method allows us to configure the model using the ModelBuilder Fluent API. This is the most powerful method of configuration and allows configuration to be specified without modifying the entity classes. The Fluent API configuration has the highest precedence and will override conventions and data annotations.
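Before moving on, a quick note on that hardcoded connection string. In a real application it would typically be read from appsettings.json. The following is a minimal sketch of one way to do this; it assumes the Microsoft.Extensions.Configuration and Microsoft.Extensions.Configuration.Json packages and an appsettings.json file containing a ConnectionStrings:PersonDatabase entry, none of which are part of the original sample:

// Sketch only: shows just the OnConfiguring override reading the connection
// string from appsettings.json instead of hardcoding it. The rest of the
// Context class stays the same as above.
using System.IO;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;

public class Context : DbContext
{
    public DbSet<Person> Persons { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        IConfigurationRoot configuration = new ConfigurationBuilder()
            .SetBasePath(Directory.GetCurrentDirectory())   // folder that contains appsettings.json
            .AddJsonFile("appsettings.json")
            .Build();

        //// "PersonDatabase" is an assumed key under the ConnectionStrings section.
        optionsBuilder.UseSqlServer(configuration.GetConnectionString("PersonDatabase"));
    }
}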
The Fluent API statement in OnModelCreating has the same effect as the following data annotation on the Name property in the Person class:

[Required]
public string Name { get; set; }

The preceding point highlights the flexibility and configuration that EF Core brings to the table. EF Core uses a combination of conventions, attributes, and Fluent API statements to build a database model at runtime. All we have to do is perform actions on the model classes using a combination of these and they will automatically be translated into appropriate changes in the database. Before we conclude this point, let's have a quick look at each of the different ways to configure a database model:

EF Core conventions: The conventions in EF Core are comprehensive. They are the default rules by which EF Core builds a database model based on classes. A few of the simpler yet important default conventions are listed here:

EF Core creates database tables for all DbSet<TEntity> properties in a Context class, with the same name as that of the property. In the preceding example, the table name would be Persons based on this convention.

EF Core creates tables for entities that are not included as DbSet properties but are reachable through reference properties in other DbSet entities. If the Person class had a complex/navigation property, EF Core would have created a table for it as well.

EF Core creates columns for all the scalar read-write properties of a class, with the same name as the property by default. It uses the reference and collection properties for building relationships among the corresponding tables in the database. In the preceding example, the scalar properties of Person correspond to columns in the Persons table.

EF Core assumes a property named ID, or one that is suffixed with ID, to be the primary key. If the property is an integer type or Guid type, then EF Core also assumes it to be IDENTITY and automatically assigns a value when inserting data. This is precisely what we will make use of in our example while inserting or creating a new Person.

EF Core maps the data type of a database column based on the data type of the property defined in the C# class. A few of the mappings between C# data types and SQL Server column data types are listed in the following table:

C# data type      SQL Server data type
int               int
string            nvarchar(Max)
decimal           decimal(18,2)
float             real
byte[]            varbinary(Max)
datetime          datetime
bool              bit
byte              tinyint
short             smallint
long              bigint
double            float

There are many other conventions, and we can define custom conventions as well. For more details, please read the official Microsoft documentation at https://docs.microsoft.com/en-us/ef/core/modeling/.

Attributes: Conventions are often not enough to map the class to database objects. In such scenarios, we can use attributes called data annotation attributes to get the desired results. The [Required] attribute that we have just seen is an example of a data annotation attribute.

Fluent API: This is the most powerful way of configuring the model and can be used in addition to, or in place of, attributes. The code written in the OnModelCreating method is an example of a Fluent API statement.

If we check now, there is no PersonDatabase database. So, we need to create the database from the model by adding a migration. EF Core includes different migration commands to create or update the database based on the model.
To do so in Visual Studio 2017, go to Tools | Nuget Package Manager | Package Manager Console, as shown in the following screenshot:

This will open the Package Manager Console window. Select the Default Project as GettingStartedWithEFCore and type the following command:

add-migration CreatePersonDatabase

If you are not using Visual Studio 2017 and you are dependent on .NET Core CLI tooling, you can use the following command:

dotnet ef migrations add CreatePersonDatabase

We have not installed the Microsoft.EntityFrameworkCore.Design package, so it will give an error: Your startup project 'GettingStartedWithEFCore' doesn't reference Microsoft.EntityFrameworkCore.Design. This package is required for the Entity Framework Core Tools to work. Ensure your startup project is correct, install the package, and try again.

So let's first go to the NuGet Package Manager and install this package. After successful installation of this package, if we run the preceding command again, we should be able to run the migrations successfully. It will also tell us the command to undo the migration by displaying the message To undo this action, use Remove-Migration. We should see the new files added in the Solution Explorer in the Migrations folder, as shown in the following screenshot:

Although we have migrations applied, we have still not created a database. To create the database, we need to run the following commands.

In Visual Studio 2017:

update-database -verbose

In .NET Core CLI:

dotnet ef database update

If all goes well, we should have the database created with the Persons table (property of type DbSet<Person>) in the database. Let's validate the table and database by using SQL Server Management Studio (SSMS). If SSMS is not installed on your machine, you can also use Visual Studio 2017 to view the database and table. Let's check the created database. In Visual Studio 2017, click on the View menu and select Server Explorer, as shown in the following screenshot:

In Server Explorer, right-click on Data Connections and then select Add Connection. The Add Connection dialog will show up. Enter .\SQLEXPRESS in the Server name (since we installed SQL Express 2017) and select PersonDatabase as the database, as shown in the following screenshot:

On clicking OK, we will see the database named PersonDatabase and, if we expand the tables, we can see the Persons table as well as the _EFMigrationsHistory table. Notice that the properties in the Person class that had setters are the only properties that get transformed into table columns in the Persons table. The Age property is read-only in the class we created and therefore we do not see an Age column in the database table, as shown in the following screenshot:

This is the first migration to create a database. Whenever we add or update the model classes or configurations, we need to sync the database with the model using the add-migration and update-database commands. With this, we have our model class ready and the corresponding database created. The following image summarizes how the properties have been mapped from the C# class to the database table columns:

Now, we will use the Context class to perform CRUD operations. Let's go back to our Main.cs and write the following code.
The code is well commented, so please go through the comments to understand the flow:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Getting started with EF Core");
        Console.WriteLine("We will do CRUD operations on Person class.");

        //// Lets create an instance of Person class.
        Person person = new Person()
        {
            Name = "Rishabh Verma",
            Gender = true, //// For demo true = Male, false = Female. Prefer enum in real cases.
            DateOfBirth = new DateTime(2000, 10, 23)
        };

        using (var context = new Context())
        {
            //// Context has strongly typed property named Persons which refers to Persons table.
            //// It has methods Add, Find, Update, Remove to perform CRUD among many others.
            //// Use AddRange to add multiple persons in once.
            //// Complete set of APIs can be seen by using F12 on the Persons property below in Visual Studio IDE.
            var personData = context.Persons.Add(person);

            //// Though we have done Add, nothing has actually happened in database. All changes are in context only.
            //// We need to call save changes, to persist these changes in the database.
            context.SaveChanges();

            //// Notice above that Id is Primary Key (PK) and hence has not been specified in the person object passed to context.
            //// So, to know the created Id, we can use the below Id
            int createdId = personData.Entity.Id;

            //// If all goes well, person data should be persisted in the database.
            //// Use proper exception handling to discover unhandled exception if any. Not showing here for simplicity and brevity.
            //// createdId variable would now hold the id of created person.

            //// READ BEGINS
            Person readData = context.Persons.Where(j => j.Id == createdId).FirstOrDefault();

            //// We have the data of person where Id == createdId, i.e. details of Rishabh Verma.
            //// Lets update the person data all together just for demonstrating update functionality.
            //// UPDATE BEGINS
            person.Name = "Neha Shrivastava";
            person.Gender = false;
            person.DateOfBirth = new DateTime(2000, 6, 15);
            person.Id = createdId; //// For update cases, we need this to be specified.

            //// Update the person in context.
            context.Persons.Update(person);

            //// Save the updates.
            context.SaveChanges();

            //// DELETE the person object.
            context.Remove(readData);
            context.SaveChanges();
        }

        Console.WriteLine("All done. Please press Enter key to exit...");
        Console.ReadLine();
    }
}

With this, we have completed our sample app to get started with EF Core. I hope this simple example will set you up to start using EF Core with confidence and encourage you to start exploring it further. The detailed features of EF Core can be learned from the official Microsoft documentation available at https://docs.microsoft.com/en-us/ef/core/. If you're interested in learning more, head over to this book, .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

How to build a chatbot with Microsoft Bot framework
Working with Entity Client and Entity SQL
Get to know ASP.NET Core Web API [Tutorial]
Polymorphism and type-pattern matching in Python [Tutorial]
Aaron Lazar
13 Aug 2018
11 min read
Some functional programming languages offer clever approaches to the problem of working with statically typed function definitions. The problem is that many functions we'd like to write are entirely generic with respect to data type. For example, most of our statistical functions are identical for int or float numbers, as long as the division returns a value that is a subclass of numbers.Real (for example, Decimal, Fraction, or float). In many functional languages, sophisticated type or type-pattern matching rules are used by the compiler to make a single generic definition work for multiple data types. Python doesn't have this problem and doesn't need the pattern matching. In this article, we'll understand how to achieve polymorphism and type-pattern matching in Python. This Python tutorial is an extract taken from the 2nd edition of the bestseller, Functional Python Programming, authored by Steven Lott.

Instead of the (possibly) complex features of statically typed functional languages, Python changes the approach dramatically. Python uses dynamic selection of the final implementation of an operator based on the data types being used. In Python, we always write generic definitions. The code isn't bound to any specific data type. The Python runtime will locate the appropriate operations based on the types of the actual objects in use. The 3.3.7 Coercion rules section of the language reference manual and the numbers module in the library provide details on how this mapping from operation to special method name works. This means that the compiler doesn't certify that our functions are expecting and producing the proper data types. We generally rely on unit testing and the mypy tool for this kind of type checking.

In rare cases, we might need to have different behavior based on the types of data elements. We have two ways to tackle this:

We can use the isinstance() function to distinguish the different cases
We can create our own subclass of numbers.Number or NamedTuple and implement proper polymorphic special method names.

In some cases, we'll actually need to do both so that we can include appropriate data type conversions for each operation. Additionally, we'll also need to use the cast() function to make the types explicit to the mypy tool.

The ranking example in the previous section is tightly bound to the idea of applying rank-ordering to simple pairs. While this is the way the Spearman correlation is defined, a multivariate dataset has a need to do rank-order correlation among all the variables. The first thing we'll need to do is generalize our idea of rank-order information. The following is a NamedTuple value that handles a tuple of ranks and a raw data object:

from typing import NamedTuple, Tuple, Any

class Rank_Data(NamedTuple):
    rank_seq: Tuple[float]
    raw: Any

A typical use of this kind of class definition is shown in this example:

>>> data = {'key1': 1, 'key2': 2}
>>> r = Rank_Data((2, 7), data)
>>> r.rank_seq[0]
2
>>> r.raw
{'key1': 1, 'key2': 2}

The row of raw data in this example is a dictionary. There are two rankings for this particular item in the overall list. An application can get the sequence of rankings as well as the original raw data item.

We'll add some syntactic sugar to our ranking function. In many previous examples, we've required either an iterable or a concrete collection. The for statement is graceful about working with either one.
However, we don't always use the for statement, and for some functions, we've had to explicitly use iter() to make an iterable out of a collection. We can handle this situation with a simple isinstance() check, as shown in the following code snippet:

def some_function(seq_or_iter: Union[Sequence, Iterator]):
    if isinstance(seq_or_iter, Sequence):
        yield from some_function(iter(seq_or_iter), key)
        return
    # Do the real work of the function using the Iterator

This example includes a type check to handle the small difference between a Sequence object and an Iterator. Specifically, the function uses iter() to create an Iterator from a Sequence, and calls itself recursively with the derived value.

For rank-ordering, the Union[Sequence, Iterator] will be supported. Because the source data must be sorted for ranking, it's easier to use list() to transform a given iterator into a concrete sequence. The essential isinstance() check will be used, but instead of creating an iterator from a sequence (as shown previously), the following examples will create a sequence object from an iterator. In the context of our rank-ordering function, we can make the function somewhat more generic. The following two expressions define the inputs:

Source = Union[Rank_Data, Any]
Union[Sequence[Source], Iterator[Source]]

There are four combinations defined by these two types:

Sequence[Rank_Data]
Sequence[Any]
Iterator[Rank_Data]
Iterator[Any]

Handling four combination data types

Here's the rank_data() function with three cases for handling the four combinations of data types:

from typing import (
    Callable, Sequence, Iterator, Union, Iterable, TypeVar, cast, Union
)

K_ = TypeVar("K_")  # Some comparable key type used for ranking.
Source = Union[Rank_Data, Any]

def rank_data(
        seq_or_iter: Union[Sequence[Source], Iterator[Source]],
        key: Callable[[Rank_Data], K_] = lambda obj: cast(K_, obj)
    ) -> Iterable[Rank_Data]:
    if isinstance(seq_or_iter, Iterator):
        # Iterator? Materialize a sequence object
        yield from rank_data(list(seq_or_iter), key)
        return
    data: Sequence[Rank_Data]
    if isinstance(seq_or_iter[0], Rank_Data):
        # Collection of Rank_Data is what we prefer.
        data = seq_or_iter
    else:
        # Convert to Rank_Data and process.
        empty_ranks: Tuple[float] = cast(Tuple[float], ())
        data = list(
            Rank_Data(empty_ranks, raw_data)
            for raw_data in cast(Sequence[Source], seq_or_iter)
        )
    for r, rd in rerank(data, key):
        new_ranks = cast(
            Tuple[float],
            rd.rank_seq + cast(Tuple[float], (r,)))
        yield Rank_Data(new_ranks, rd.raw)

We've decomposed the ranking into three cases to cover the four different types of data. The following are the cases defined by the union of unions:

Given an Iterator (an object without a usable __getitem__() method), we'll materialize a list object to work with. This will work for Rank_Data as well as any other raw data type. This case covers objects which are Iterator[Rank_Data] as well as Iterator[Any].

Given a Sequence[Any], we'll wrap the unknown objects into Rank_Data tuples with an empty collection of rankings to create a Sequence[Rank_Data].

Finally, given a Sequence[Rank_Data], add yet another ranking to the tuple of ranks inside each Rank_Data container.

The first case calls rank_data() recursively. The other two cases both rely on a rerank() function that builds a new Rank_Data tuple with additional ranking values. This contains several rankings for a complex record of raw data values. Note that a relatively complex cast() expression is required to disambiguate the use of generic tuples for the rankings.
The mypy tool offers a reveal_type() function that can be incorporated to debug the inferred types.

The rerank() function follows a slightly different design to the example of the rank() function shown previously. It yields two-tuples with the rank and the original data object:

def rerank(
        rank_data_iter: Iterable[Rank_Data],
        key: Callable[[Rank_Data], K_]
    ) -> Iterator[Tuple[float, Rank_Data]]:
    sorted_iter = iter(
        sorted(
            rank_data_iter, key=lambda obj: key(obj.raw)
        )
    )
    # Apply ranker to head, *tail = sorted(rank_data_iter)
    head = next(sorted_iter)
    yield from ranker(sorted_iter, 0, [head], key)

The idea behind rerank() is to sort a collection of Rank_Data objects. The first item, head, is used to provide a seed value to the ranker() function. The ranker() function can examine the remaining items in the iterable to see if they match this initial value; this allows computing a proper rank for a batch of matching items.

The ranker() function accepts a sorted iterable of data, a base rank number, and an initial collection of items of the minimum rank. The result is an iterable sequence of two-tuples with a rank number and an associated Rank_Data object:

def ranker(
        sorted_iter: Iterator[Rank_Data],
        base: float,
        same_rank_seq: List[Rank_Data],
        key: Callable[[Rank_Data], K_]
    ) -> Iterator[Tuple[float, Rank_Data]]:
    try:
        value = next(sorted_iter)
    except StopIteration:
        dups = len(same_rank_seq)
        yield from yield_sequence(
            (base+1+base+dups)/2, iter(same_rank_seq))
        return
    if key(value.raw) == key(same_rank_seq[0].raw):
        yield from ranker(
            sorted_iter, base, same_rank_seq+[value], key)
    else:
        dups = len(same_rank_seq)
        yield from yield_sequence(
            (base+1+base+dups)/2, iter(same_rank_seq))
        yield from ranker(
            sorted_iter, base+dups, [value], key)

This starts by attempting to extract the next item from the sorted_iter collection of sorted Rank_Data items. If this fails with a StopIteration exception, there is no next item and the source has been exhausted. The final output is the final batch of equal-valued items in the same_rank_seq sequence.

If the sequence has a next item, the key() function extracts the key value. If this new value matches the keys in the same_rank_seq collection, it is accumulated into the current batch of same-valued keys. The final result is based on the rest of the items in sorted_iter, the current value for the rank, a larger batch of same_rank items that now includes the head value, and the original key() function.

If the next item's key doesn't match the current batch of equal-valued items, the final result has two parts. The first part is the batch of equal-valued items accumulated in same_rank_seq. This is followed by the reranking of the remainder of the sorted items. The base value for these is incremented by the number of equal-valued items, a fresh batch of equal-rank items is initialized with the distinct key, and the original key() extraction function is provided.

The output from ranker() depends on the yield_sequence() function, which looks as follows:

def yield_sequence(
        rank: float,
        same_rank_iter: Iterator[Rank_Data]
    ) -> Iterator[Tuple[float, Rank_Data]]:
    head = next(same_rank_iter)
    yield rank, head
    yield from yield_sequence(rank, same_rank_iter)

We've written this in a way that emphasizes the recursive definition. For any practical work, this should be optimized into a single for statement. When doing Tail-Call Optimization to transform a recursion into a loop, define unit test cases first. Be sure the recursion passes the unit test cases before optimizing.
The following are some examples of using this function to rank (and rerank) data. We'll start with a simple collection of scalar values:

>>> scalars = [0.8, 1.2, 1.2, 2.3, 18]
>>> list(rank_data(scalars))
[Rank_Data(rank_seq=(1.0,), raw=0.8), Rank_Data(rank_seq=(2.5,), raw=1.2),
 Rank_Data(rank_seq=(2.5,), raw=1.2), Rank_Data(rank_seq=(4.0,), raw=2.3),
 Rank_Data(rank_seq=(5.0,), raw=18)]

Each value becomes the raw attribute of a Rank_Data object. When we work with a slightly more complex object, we can also have multiple rankings. The following is a sequence of two-tuples:

>>> pairs = ((2, 0.8), (3, 1.2), (5, 1.2), (7, 2.3), (11, 18))
>>> rank_x = list(rank_data(pairs, key=lambda x: x[0]))
>>> rank_x
[Rank_Data(rank_seq=(1.0,), raw=(2, 0.8)), Rank_Data(rank_seq=(2.0,), raw=(3, 1.2)),
 Rank_Data(rank_seq=(3.0,), raw=(5, 1.2)), Rank_Data(rank_seq=(4.0,), raw=(7, 2.3)),
 Rank_Data(rank_seq=(5.0,), raw=(11, 18))]
>>> rank_xy = list(rank_data(rank_x, key=lambda x: x[1]))
>>> rank_xy
[Rank_Data(rank_seq=(1.0, 1.0), raw=(2, 0.8)), Rank_Data(rank_seq=(2.0, 2.5), raw=(3, 1.2)),
 Rank_Data(rank_seq=(3.0, 2.5), raw=(5, 1.2)), Rank_Data(rank_seq=(4.0, 4.0), raw=(7, 2.3)),
 Rank_Data(rank_seq=(5.0, 5.0), raw=(11, 18))]

Here, we defined a collection of pairs. Then, we ranked the two-tuples, assigning the sequence of Rank_Data objects to the rank_x variable. We then ranked this collection of Rank_Data objects, creating a second rank value and assigning the result to the rank_xy variable.

The resulting sequence can be used for a slightly modified rank_corr() function to compute the rank correlations of any of the available values in the rank_seq attribute of the Rank_Data objects. We'll leave this modification as an exercise for you.

If you found this tutorial useful and would like to learn more such techniques, head over to get Steven Lott's bestseller, Functional Python Programming.

Why functional programming in Python matters: Interview with best selling author, Steven Lott
Top 7 Python programming books you need to read
Members Inheritance and Polymorphism

Modern Cloud Native architectures: Microservices, Containers, and Serverless - Part 1
Guest Contributor
13 Aug 2018
9 min read
This whitepaper is written by Mina Andrawos, an experienced engineer who has developed deep experience in the Go language and modern software architectures. He regularly writes articles and tutorials about the Go language, and also shares open source projects. Mina Andrawos has authored the book Cloud Native programming with Golang, which provides practical techniques, code examples, and architectural patterns required to build cloud native microservices in the Go language. He is also the author of the Mastering Go Programming and Modern Golang Programming video courses.

This paper sheds some light on, and provides practical exposure to, some key topics in the modern software industry, namely cloud native applications. This includes microservices, containers, and serverless applications. The paper will cover the practical advantages and disadvantages of the technologies covered.

Microservices

The microservices architecture has gained a reputation as a powerful approach to architect modern software applications. So what are microservices? Microservices can be described as simply the idea of separating the functionality required from a software application into multiple independent small software services, or "microservices." Each microservice is responsible for an individual focused task. In order for microservices to collaborate together to form a large scalable application, they communicate and exchange data.

Microservices were born out of the need to tame the complexity and inflexibility of "monolithic" applications. A monolithic application is a type of application where all required functionality is coded together into the same service. For example, here is a diagram representing a monolithic events (like concerts, shows, and so on) booking application that takes care of the booking payment processing and event reservation:

The application can be used by a customer to book a concert or a show. A user interface will be needed. Furthermore, we will also need a search functionality to look for events, a bookings handler to process the user booking and then save it, and an events handler to help find the event, ensure it has seats available, and then link it to the booking. In a production-level application, more tasks will be needed, like payment processing for example, but for now let's focus on the four tasks outlined in the above figure.

This monolithic application will work well with a small to medium load. It will run on a single server, connect to a single database, and will probably be written in the same programming language. Now, what will happen if the business grows exponentially and hundreds of thousands or millions of users need to be handled and processed? Initially, the short-term solution would be to ensure that the server where the application runs has powerful hardware specifications to withstand higher loads, and if not, then add more memory, storage, and processing power to the server. This is called vertical scaling, which is the act of increasing the power of the hardware, like RAM and hard drive capacity, to run heavy applications. However, this is typically not sustainable in the long run as the load on the application continues to grow.

Another challenge with monolithic applications is the inflexibility caused by being limited to only one or two programming languages. This inflexibility can affect the overall quality and efficiency of the application.
For example, node.js is a popular JavaScript framework for building web applications, whereas R is popular for data science applications. A monolithic application will make it difficult to utilize both technologies, whereas in a microservices application, we can simply build a data science service written in R and a web service written in Node.js. The microservices version of the events application will take the below form:

This application will be capable of scaling among multiple servers, a practice known as horizontal scaling. Each service can be deployed on a different server with dedicated resources, or in separate containers (more on that later). The different services can be written in different programming languages, enabling greater flexibility, and different dedicated teams can focus on different services, achieving more overall quality for the application. Another notable advantage of using microservices is the ease of continuous delivery, which is the ability to deploy software often, and at any time. The reason why microservices make continuous delivery easier is that a new feature deployed to one microservice is less likely to affect other microservices, compared to monolithic applications.

Issues with Microservices

One notable drawback of relying heavily on microservices is the fact that they can become too complicated to manage in the long run as they grow in number and scope. There are approaches to mitigate this by utilizing monitoring tools such as Prometheus to detect problems, container technologies such as Docker to avoid pollution of the host environments, and avoiding over-designing the services. However, these approaches take effort and time.

Cloud native applications

Microservices architectures are a natural fit for cloud native applications. A cloud native application is simply defined as an application built from the ground up for cloud computing architectures. This simply means that our application is cloud native if we design it as if it is expected to be deployed on a distributed, scalable infrastructure. For example, building an application with a redundant microservices architecture (we'll see an example shortly) makes the application cloud native, since this architecture allows our application to be deployed in a distributed manner that allows it to be scalable and almost always available. A cloud native application does not need to always be deployed to a public cloud like AWS; we can deploy it to our own distributed cloud-like infrastructure instead, if we have one.

In fact, what makes an application fully cloud native is beyond just using microservices. Your application should employ continuous delivery, which is your ability to continuously deliver updates to your production applications without disruptions. Your application should also make use of services like message queues and technologies like containers and serverless (containers and serverless are important topics for modern software architectures, so we'll be discussing them in the next few sections). Cloud native applications assume access to numerous server nodes, access to pre-deployed software services like message queues or load balancers, ease of integration with continuous delivery services, among other things. If you deploy your cloud native application to a commercial cloud like AWS or Azure, your application gets the option to utilize cloud-only software services.
For example, DynamoDB is a powerful database engine that can only be used on Amazon Web Services for production applications. Another example is the DocumentDB database in Azure. There are also cloud-only message queues such as Amazon Simple Queue Service (SQS), which can be used to allow communication between microservices in the Amazon Web Services cloud.

As mentioned earlier, cloud native microservices should be designed to allow redundancy between services. If we take the events booking application as an example, the application will look like this:

Multiple server nodes would be allocated per microservice, allowing a redundant microservices architecture to be deployed. If the primary node or service fails for any reason, the secondary can take over, ensuring lasting reliability and availability for cloud native applications. This availability is vital for applications that cannot tolerate downtime, such as e-commerce platforms, where downtime translates into large amounts of lost revenue. Cloud native applications provide great value for developers, enterprises, and startups.

A notable tool worth mentioning in the world of microservices and cloud computing is Prometheus. Prometheus is an open source system monitoring and alerting tool that can be used to monitor complex microservices architectures and alert when an action needs to be taken. Prometheus was originally created by SoundCloud to monitor their systems, but then grew to become an independent project. The project is now a part of the Cloud Native Computing Foundation, which is a foundation tasked with building a sustainable ecosystem for cloud native applications.

Cloud native limitations

For cloud native applications, you will face some challenges if the need arises to migrate some or all of the applications. That is due to multiple reasons, depending on where your application is deployed. For example, if your cloud native application is deployed on a public cloud like AWS, cloud native APIs are not portable across cloud platforms. So, a DynamoDB database API utilized in an application will only work on AWS but not on Azure, since DynamoDB belongs exclusively to AWS. The API will also never work in a local environment because DynamoDB can only be utilized in AWS in production. Another reason is that some assumptions are made when some cloud native applications are built, like the fact that there will be a virtually unlimited number of server nodes to utilize when needed, and that a new server node can be made available very quickly. These assumptions are sometimes hard to guarantee in a local data center environment, where real servers, networking hardware, and wiring need to be purchased.

This brings us to the end of Part 1 of this whitepaper. Check out Part 2 tomorrow to learn about containers and serverless applications, along with their practical advantages and limitations.

About the author: Mina Andrawos

Mina Andrawos is an experienced engineer who has developed deep experience in Go from using it personally and professionally. He regularly authors articles and tutorials about the language, and also shares Go's open source projects. He has written numerous Go applications with varying degrees of complexity. Other than Go, he has skills in Java, C#, Python, and C++. He has worked with various databases and software architectures. He is also skilled in the agile methodology for software development. Besides software development, he has working experience of scrum mastering, sales engineering, and software product management.
Building microservices from a monolith Java EE app [Tutorial]
6 Ways to blow up your Microservices!
Have Microservices killed the monolithic architecture? Maybe not!

Building a Tic-tac-toe game in ASP.Net Core 2.0 [Tutorial]
Aaron Lazar
13 Aug 2018
28 min read
Learning is more fun if we do it while making games. With this thought, let's continue our quest to learn .NET Core 2.0 by writing a Tic-tac-toe game in .NET Core 2.0. We will develop the game in the ASP.NET Core 2.0 web app, using SignalR Core. We will follow a step-by-step approach and use Visual Studio 2017 as the primary IDE, but will list the steps needed while using the Visual Studio Code editor as well. Let's do the project setup first and then we will dive into the coding. This tutorial has been extracted from the book .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava.

Installing SignalR Core NuGet package

Create a new ASP.NET Core 2.0 MVC app named TicTacToeGame. With this, we will have a basic working ASP.NET Core 2.0 MVC app in place. However, to leverage SignalR Core in our app, we need to install the SignalR Core NuGet and the client packages. To install the SignalR Core NuGet package, we can perform one of the following two approaches in the Visual Studio IDE:

In the context menu of the TicTacToeGame project, click on Manage NuGet Packages. It will open the NuGet Package Manager for the project. In the Browse section, search for the Microsoft.AspNetCore.SignalR package and click Install. This will install SignalR Core in the app. Please note that currently the package is in the preview stage and hence the pre-release checkbox has to be ticked:

Edit the TicTacToeGame.csproj file, add the following code snippet in the ItemGroup code containing package references, and click Save. As soon as the file is saved, the tooling will take care of restoring the packages and in a while, the SignalR package will be installed. This approach can be used with Visual Studio Code as well. Although Visual Studio Code detects the unresolved dependencies and may prompt you to restore the package, it is recommended that immediately after editing and saving the file, you run the dotnet restore command in the terminal window at the location of the project:

<ItemGroup>
  <PackageReference Include="Microsoft.AspNetCore.All" Version="2.0.0" />
  <PackageReference Include="Microsoft.AspNetCore.SignalR" Version="1.0.0-alpha1-final" />
</ItemGroup>

Now we have server-side packages installed. We still need to install the client-side package of SignalR, which is available through npm. To do so, we need to first ascertain whether we have npm installed on the machine or not. If not, we need to install it. npm is distributed with Node.js, so we need to download and install Node.js from https://nodejs.org/en/. The installation is quite straightforward. Once this installation is done, open a Command Prompt at the project location and run the following command:

npm install @aspnet/signalr-client

This will install the SignalR client package. Just go to the package location (npm creates a node_modules folder in the project directory). The relative path from the project directory would be \node_modules\@aspnet\signalr-client\dist\browser. From this location, copy the signalr-client-1.0.0-alpha1-final.js file into the wwwroot\js folder. In the current version, the name is signalr-client-1.0.0-alpha1-final.js. With this, we are done with the project setup and we are ready to use SignalR goodness as well. So let's dive into the coding.

Coding the game

In this section, we will implement our gaming solution. The end output will be the working two-player Tic-Tac-Toe game.
We will do the coding in steps for ease of understanding:

In the Startup class, we modify the ConfigureServices method to add SignalR to the container, by writing the following code:

//// Adds SignalR to the services container.
services.AddSignalR();

In the Configure method of the same class, we configure the pipeline to use SignalR and intercept and wire up the request containing gameHub to our SignalR hub that we will be creating, with the following code:

//// Use - SignalR & let it know to intercept and map any request having gameHub.
app.UseSignalR(routes =>
{
    routes.MapHub<GameHub>("gameHub");
});

The following is the code for both methods, for the sake of clarity and completion. Other methods and properties are removed for brevity:

// This method gets called by the run-time. Use this method to add services to the container.
public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();

    //// Adds SignalR to the services container.
    services.AddSignalR();
}

// This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    if (env.IsDevelopment())
    {
        app.UseDeveloperExceptionPage();
        app.UseBrowserLink();
    }
    else
    {
        app.UseExceptionHandler("/Home/Error");
    }

    app.UseStaticFiles();

    app.UseMvc(routes =>
    {
        routes.MapRoute(
            name: "default",
            template: "{controller=Home}/{action=Index}/{id?}");
    });

    //// Use - SignalR & let it know to intercept and map any request having gameHub.
    app.UseSignalR(routes =>
    {
        routes.MapHub<GameHub>("gameHub");
    });
}

The previous two steps set up SignalR for us. Now, let's start with the coding of the player registration form. We want the player to be registered with a name and a display picture. Later, the server will also need to know whether the player is playing, waiting for a move, searching for an opponent, and so on. Let's create the Player model in the Models folder in the app. The code comments are self-explanatory:

/// <summary>
/// The player class. Each player of Tic-Tac-Toe game would be an instance of this class.
/// </summary>
internal class Player
{
    /// <summary>
    /// Gets or sets the name of the player. This would be set at the time user registers.
    /// </summary>
    public string Name { get; set; }

    /// <summary>
    /// Gets or sets the opponent player. The player against whom the player would be playing.
    /// This is determined/set when the players click Find Opponent Button in the UI.
    /// </summary>
    public Player Opponent { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether the player is playing.
    /// This is set when the player starts a game.
    /// </summary>
    public bool IsPlaying { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether the player is waiting for opponent to make a move.
    /// </summary>
    public bool WaitingForMove { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether the player is searching for opponent.
    /// </summary>
    public bool IsSearchingOpponent { get; set; }

    /// <summary>
    /// Gets or sets the time when the player registered.
    /// </summary>
    public DateTime RegisterTime { get; set; }

    /// <summary>
    /// Gets or sets the image of the player.
    /// This would be set at the time of registration, if the user selects the image.
    /// </summary>
    public string Image { get; set; }

    /// <summary>
    /// Gets or sets the connection id of the player connection with the gameHub.
/// </summary> public string ConnectionId { get; set; } } Now, we need to have a UI in place so that the player can fill in the form and register. We also need to show the image preview to the player when he/she browses the image. To do so, we will use the Index.cshtml view of the HomeController class that comes with the default MVC template. We will refer to the following two .js files in the _Layout.cshtml partial view so that they are available to all the views. Alternatively, you could add these in the Index.cshtml view as well, but its highly recommended that common scripts should be added in _Layout.cshtml. The version of the script file may be different in your case. These are the currently available latest versions. Although jQuery is not required to be the library of choice for us, we will use jQuery to keep the code clean, simple, and compact. With these references, we have jQuery and SignalR available to us on the client side: <script src="~/lib/jquery/dist/jquery.js"></script> <!-- jQuery--> <script src="~/js/signalr-client-1.0.0-alpha1-final.js"></script> <!-- SignalR--> After adding these references, create the simple HTML UI for the image preview and registration, as follows: <div id="divPreviewImage"> <!-- To display the browsed image--> <fieldset> <div class="form-group"> <div class="col-lg-2"> <image src="" id="previewImage" style="height:100px;width:100px;border:solid 2px dotted; float:left" /> </div> <div class="col-lg-10" id="divOpponentPlayer"> <!-- To display image of opponent player--> <image src="" id="opponentImage" style="height:100px;width:100px;border:solid 2px dotted; float:right;" /> </div> </div> </fieldset> </div> <div id="divRegister"> <!-- Our Registration form--> <fieldset> <legend>Register</legend> <div class="form-group"> <label for="name" class="col-lg-2 control- label">Name</label> <div class="col-lg-10"> <input type="text" class="form-control" id="name" placeholder="Name"> </div> </div> <div class="form-group"> <label for="image" class="col-lg-2 control- label">Avatar</label> <div class="col-lg-10"> <input type="file" class="form-control" id="image" /> </div> </div> <div class="form-group"> <div class="col-lg-10 col-lg-offset-2"> <button type="button" class="btn btn-primary" id="btnRegister">Register</button> </div> </div> </fieldset> </div> When the player registers by clicking the Register button, the player's details need to be sent to the server. To do this, we will write the JavaScript to send details to our gameHub: let hubUrl = '/gameHub'; let httpConnection = new signalR.HttpConnection(hubUrl); let hubConnection = new signalR.HubConnection(httpConnection); var playerName = ""; var playerImage = ""; var hash = "#"; hubConnection.start(); $("#btnRegister").click(function () { //// Fires on button click playerName = $('#name').val(); //// Sets the player name with the input name. playerImage = $('#previewImage').attr('src'); //// Sets the player image variable with specified image var data = playerName.concat(hash, playerImage); //// The registration data to be sent to server. hubConnection.invoke('RegisterPlayer', data); //// Invoke the "RegisterPlayer" method on gameHub. }); $("#image").change(function () { //// Fires when image is changed. readURL(this); //// HTML 5 way to read the image as data url. }); function readURL(input) { if (input.files && input.files[0]) { //// Go in only if image is specified. 
var reader = new FileReader(); reader.onload = imageIsLoaded; reader.readAsDataURL(input.files[0]); } } function imageIsLoaded(e) { if (e.target.result) { $('#previewImage').attr('src', e.target.result); //// Sets the image source for preview. $("#divPreviewImage").show(); } }; The player now has a UI to input the name and image, see the preview image, and click Register. On clicking the Register button, we are sending the concatenated name and image to the gameHub on the server through hubConnection.invoke('RegisterPlayer', data);  So, it's quite simple for the client to make a call to the server. Initialize the hubConnection by specifying hub name as we did in the first three lines of the preceding code snippet. Start the connection by hubConnection.start();, and then invoke the server hub method by calling the invoke method, specifying the hub method name and the parameter it expects. We have not yet created the hub, so let's create the GameHub class on the server: /// <summary> /// The Game Hub class derived from Hub /// </summary> public class GameHub : Hub { /// <summary> /// To keep the list of all the connected players registered with the game hub. We could have /// used normal list but used concurrent bag as its thread safe. /// </summary> private static readonly ConcurrentBag<Player> players = new ConcurrentBag<Player>(); /// <summary> /// Registers the player with name and image. /// </summary> /// <param name="nameAndImageData">The name and image data sent by the player.</param> public void RegisterPlayer(string nameAndImageData) { var splitData = nameAndImageData?.Split(new char[] { '#' }, StringSplitOptions.None); string name = splitData[0]; string image = splitData[1]; var player = players?.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId); if (player == null) { player = new Player { ConnectionId = Context.ConnectionId, Name = name, IsPlaying = false, IsSearchingOpponent = false, RegisterTime = DateTime.UtcNow, Image = image }; if (!players.Any(j => j.Name == name)) { players.Add(player); } } this.OnRegisterationComplete(Context.ConnectionId); } /// <summary> /// Fires on completion of registration. /// </summary> /// <param name="connectionId">The connectionId of the player which registered</param> public void OnRegisterationComplete(string connectionId) { //// Notify this connection id that the registration is complete. this.Clients.Client(connectionId). InvokeAsync(Constants.RegistrationComplete); } } The code comments make it self-explanatory. The class should derive from the SignalR Hub class for it to be recognized as Hub. There are two methods of interest which can be overridden. Notice that both the methods follow the async pattern and hence return Task: Task OnConnectedAsync(): This method fires when a client/player connects to the hub. Task OnDisconnectedAsync(Exception exception): This method fires when a client/player disconnects or looses the connection. We will override this method to handle the scenario where the player disconnects. 
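The body of this override is not listed at this point, so the following is only a minimal sketch of one possible implementation (not the author's exact code). It assumes a Constants.OpponentDisconnected value holding the name of the opponentDisconnected client method that the client registers later, and it reuses the players collection declared in the hub above:

/// <summary>
/// Sketch only: one possible way to handle a dropped connection.
/// </summary>
public override async Task OnDisconnectedAsync(Exception exception)
{
    //// Find the player that owns the connection which just dropped.
    var player = players.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId);
    if (player?.Opponent != null && player.IsPlaying)
    {
        //// Constants.OpponentDisconnected is assumed to hold "opponentDisconnected".
        //// Tell the opponent, so the client can declare them the winner.
        await Clients.Client(player.Opponent.ConnectionId)
            .InvokeAsync(Constants.OpponentDisconnected, player.Name);
        player.Opponent.IsPlaying = false;
        player.Opponent.Opponent = null;
    }
    await base.OnDisconnectedAsync(exception);
}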
There are also a few properties that the hub class exposes: Context: This property is of type HubCallerContext and gives us access to the following properties: Connection: Gives access to the current connection User: Gives access to the ClaimsPrincipal of the user who is currently connected ConnectionId: Gives the current connection ID string Clients: This property is of type IHubClients and gives us the way to communicate to all the clients via the client proxy Groups: This property is of type IGroupManager and provides a way to add and remove connections to the group asynchronously To keep the things simple, we are not using a database to keep track of our registered players. Rather we will use an in-memory collection to keep the registered players. We could have used a normal list of players, such as List<Player>, but then we would need all the thread safety and use one of the thread safety primitives, such as lock, monitor, and so on, so we are going with ConcurrentBag<Player>, which is thread safe and reasonable for our game development. That explains the declaration of the players collection in the class. We will need to do some housekeeping to add players to this collection when they resister and remove them when they disconnect. We saw in previous step that the client invoked the RegisterPlayer method of the hub on the server, passing in the name and image data. So we defined a public method in our hub, named RegisterPlayer, accepting the name and image data string concatenated through #. This is just one of the simple ways of accepting the client data for demonstration purposes, we can also use strongly typed parameters. In this method, we split the string on # and extract the name as the first part and the image as the second part. We then check if the player with the current connection ID already exists in our players collection. If it doesn't, we create a Player object with default values and add them to our players collection. We are distinguishing the player based on the name for demonstration purposes, but we can add an Id property in the Player class and make different players have the same name also. After the registration is complete, the server needs to update the player, that the registration is complete and the player can then look for the opponent. To do so, we make a call to the OnRegistrationComplete method which invokes a method called  registrationComplete on the client with the current connection ID. Let's understand the code to invoke the method on the client: this.Clients.Client(connectionId).InvokeAsync(Constants.RegistrationComplete); On the Clients property, we can choose a client having a specific connection ID (in this case, the current connection ID from the Context) and then call InvokeAsync to invoke a method on the client specifying the method name and parameters as required. In the preceding case method, the name is registrationComplete with no parameters. Now we know how to invoke a server method from the client and also how to invoke the client method from the server. We also know how to select a specific client and invoke a method there. We can invoke the client method from the server, for all the clients, a group of clients, or a specific client, so rest of the coding stuff would be just a repetition of these two concepts. Next, we need to implement the registrationComplete method on the client. On registration completion, the registration form should be hidden and the player should be able to find an opponent to play against. 
To do so, we would write JavaScript code to hide the registration form and show the UI for finding the opponent. On clicking the Find Opponent button, we need the server to pair us against an opponent, so we need to invoke a hub method on server to find opponent. The server can respond us with two outcomes: It finds an opponent player to play against. In this case, the game can start so we need to simulate the coin toss, determine the player who can make the first move, and start the game. This would be a game board in the client-user interface. It doesn't find an opponent and asks the player to wait for another player to register and search for an opponent. This would be a no opponent found screen in the client. In both the cases, the server would do some processing and invoke a method on the client. Since we need a lot of different user interfaces for different scenarios, let's code the HTML markup inside div to make it easier to show and hide sections based on the server response. We will add the following code snippet in the body. The comments specify the purpose of each of the div elements and markup inside them: <div id="divFindOpponentPlayer"> <!-- Section to display Find Opponent --> <fieldset> <legend>Find a player to play against!</legend> <div class="form-group"> <input type="button" class="btn btn-primary" id="btnFindOpponentPlayer" value="Find Opponent Player" /> </div> </fieldset> </div> <div id="divFindingOpponentPlayer"> <!-- Section to display opponent not found, wait --> <fieldset> <legend>Its lonely here!</legend> <div class="form-group"> Looking for an opponent player. Waiting for someone to join! </div> </fieldset> </div> <div id="divGameInformation" class="form-group"> <!-- Section to display game information--> <div class="form-group" id="divGameInfo"></div> <div class="form-group" id="divInfo"></div> </div> <div id="divGame" style="clear:both"> <!-- Section where the game board would be displayed --> <fieldset> <legend>Game On</legend> <div id="divGameBoard" style="width:380px"></div> </fieldset> </div> The following client-side code would take care of Steps 7 and 8. Though the comments are self-explanatory, we will quickly see what all stuff is that is going on here. We handle the registartionComplete method and display the Find Opponent Player section. This section has a button to find an opponent player called btnFindOpponentPlayer. We define the event handler of the button to invoke the FindOpponent method on the hub. We will see the hub method implementation later, but we know that the hub method would either find an opponent or would not find an opponent, so we have defined the methods opponentFound and opponentNotFound, respectively, to handle these scenarios. In the opponentNotFound method, we just display a section in which we say, we do not have an opponent player. In the opponentFound method, we display the game section, game information section, opponent display picture section, and draw the Tic-Tac-Toe game board as a 3×3 grid using CSS styling. All the other sections are hidden: $("#btnFindOpponentPlayer").click(function () { hubConnection.invoke('FindOpponent'); }); hubConnection.on('registrationComplete', data => { //// Fires on registration complete. Invoked by server hub $("#divRegister").hide(); // hide the registration div $("#divFindOpponentPlayer").show(); // display find opponent player div. }); hubConnection.on('opponentNotFound', data => { //// Fires when no opponent is found. 
$('#divFindOpponentPlayer').hide(); //// hide the find opponent player section. $('#divFindingOpponentPlayer').show(); //// display the finding opponent player div. }); hubConnection.on('opponentFound', (data, image) => { //// Fires when opponent player is found. $('#divFindOpponentPlayer').hide(); $('#divFindingOpponentPlayer').hide(); $('#divGame').show(); //// Show game board section. $('#divGameInformation').show(); //// Show game information $('#divOpponentPlayer').show(); //// Show opponent player image. opponentImage = image; //// sets the opponent player image for display $('#opponentImage').attr('src', opponentImage); //// Binds the opponent player image $('#divGameInfo').html("<br/><span><strong> Hey " + playerName + "! You are playing against <i>" + data + "</i> </strong></span>"); //// displays the information of opponent that the player is playing against. //// Draw the tic-tac-toe game board, A 3x3 grid :) by proper styling. for (var i = 0; i < 9; i++) { $("#divGameBoard").append("<span class='marker' id=" + i + " style='display:block;border:2px solid black;height:100px;width:100px;float:left;margin:10px;'>" + i + "</span>"); } }); First we need to have a Game object to track a game, players involved, moves left, and check if there is a winner. We will have a Game class defined as per the following code. The comments detail the purpose of the methods and the properties defined: internal class Game { /// <summary> /// Gets or sets the value indicating whether the game is over. /// </summary> public bool IsOver { get; private set; } /// <summary> /// Gets or sets the value indicating whether the game is draw. /// </summary> public bool IsDraw { get; private set; } /// <summary> /// Gets or sets Player 1 of the game /// </summary> public Player Player1 { get; set; } /// <summary> /// Gets or sets Player 2 of the game /// </summary> public Player Player2 { get; set; } /// <summary> /// For internal housekeeping, To keep track of value in each of the box in the grid. /// </summary> private readonly int[] field = new int[9]; /// <summary> /// The number of moves left. We start the game with 9 moves remaining in a 3x3 grid. /// </summary> private int movesLeft = 9; /// <summary> /// Initializes a new instance of the <see cref="Game"/> class. /// </summary> public Game() { //// Initialize the game for (var i = 0; i < field.Length; i++) { field[i] = -1; } } /// <summary> /// Place the player number at a given position for a player /// </summary> /// <param name="player">The player number would be 0 or 1</param> /// <param name="position">The position where player number would be placed, should be between 0 and ///8, both inclusive</param> /// <returns>Boolean true if game is over and we have a winner.</returns> public bool Play(int player, int position) { if (this.IsOver) { return false; } //// Place the player number at the given position this.PlacePlayerNumber(player, position); //// Check if we have a winner. If this returns true, //// game would be over and would have a winner, else game would continue. return this.CheckWinner(); } } Now we have the entire game mystery solved with the Game class. We know when the game is over, we have the method to place the player marker, and check the winner. The following server side-code on the GameHub will handle Steps 7 and 8: /// <summary> /// The list of games going on. /// </summary> private static readonly ConcurrentBag<Game> games = new ConcurrentBag<Game>(); /// <summary> /// To simulate the coin toss. 
Like heads and tails, 0 belongs to one player and 1 to opponent. /// </summary> private static readonly Random toss = new Random(); /// <summary> /// Finds the opponent for the player and sets the Seraching for Opponent property of player to true. /// We will use the connection id from context to identify the current player. /// Once we have 2 players looking to play, we can pair them and simulate coin toss to start the game. /// </summary> public void FindOpponent() { //// First fetch the player from our players collection having current connection id var player = players.FirstOrDefault(x => x.ConnectionId == Context.ConnectionId); if (player == null) { //// Since player would be registered before making this call, //// we should not reach here. If we are here, something somewhere in the flow above is broken. return; } //// Set that player is seraching for opponent. player.IsSearchingOpponent = true; //// We will follow a queue, so find a player who registered earlier as opponent. //// This would only be the case if more than 2 players are looking for opponent. var opponent = players.Where(x => x.ConnectionId != Context.ConnectionId && x.IsSearchingOpponent && !x.IsPlaying).OrderBy(x =>x.RegisterTime).FirstOrDefault(); if (opponent == null) { //// Could not find any opponent, invoke opponentNotFound method in the client. Clients.Client(Context.ConnectionId) .InvokeAsync(Constants.OpponentNotFound); return; } //// Set both players as playing. player.IsPlaying = true; player.IsSearchingOpponent = false; //// Make him unsearchable for opponent search opponent.IsPlaying = true; opponent.IsSearchingOpponent = false; //// Set each other as opponents. player.Opponent = opponent; opponent.Opponent = player; //// Notify both players that they can play by invoking opponentFound method for both the players. //// Also pass the opponent name and opoonet image, so that they can visualize it. //// Here we are directly using connection id, but group is a good candidate and use here. Clients.Client(Context.ConnectionId) .InvokeAsync(Constants.OpponentFound, opponent.Name, opponent.Image); Clients.Client(opponent.ConnectionId) .InvokeAsync(Constants.OpponentFound, player.Name, player.Image); //// Create a new game with these 2 player and add it to games collection. games.Add(new Game { Player1 = player, Player2 = opponent }); } Here, we have created a games collection to keep track of ongoing games and a Random field named toss to simulate the coin toss. How FindOpponent works is documented in the comments and is intuitive to understand. Once the game starts, each player has to make a move and then wait for the opponent to make a move, until the game ends. The move is made by clicking on the available grid cells. Here, we need to ensure that cell position that is already marked by one of the players is not changed or marked. So, as soon as a valid cell is marked, we set its CSS class to notAvailable so we know that the cell is taken. While clicking on a cell, we will check whether the cell has notAvailablestyle. If yes, it cannot be marked. If not, the cell can be marked and we then send the marked position to the server hub. We also see the waitingForMove, moveMade, gameOver, and opponentDisconnected events invoked by the server based on the game state. The code is commented and is pretty straightforward. 
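The hub code above and in the steps that follow refers to a Constants class (Constants.RegistrationComplete, Constants.OpponentFound, and so on) whose listing is not shown. A minimal sketch of such a class appears below, assuming each constant simply holds the matching client-side method name; the WaitingForOpponent value in particular is an assumption, inferred from the waitingForMove handler registered on the client:

/// <summary>
/// Sketch only: the client method names invoked by the hub. The values are assumed from the
/// hubConnection.on(...) handlers registered in the client-side script.
/// </summary>
internal static class Constants
{
    public const string RegistrationComplete = "registrationComplete";
    public const string OpponentNotFound = "opponentNotFound";
    public const string OpponentFound = "opponentFound";
    public const string WaitingForOpponent = "waitingForMove"; //// Assumed mapping to the client handler name.
    public const string MoveMade = "moveMade";
    public const string GameOver = "gameOver";
    public const string OpponentDisconnected = "opponentDisconnected";
}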
The moveMade method in the following code makes use of the MoveInformation class, which we will define at the server for sharing move information with both players: //// Triggers on clicking the grid cell. $(document).on('click', '.marker', function () { if ($(this).hasClass("notAvailable")) { //// Cell is already taken. return; } hubConnection.invoke('MakeAMove', $(this)[0].id); //// Cell is valid, send details to hub. }); //// Fires when player has to make a move. hubConnection.on('waitingForMove', data => { $('#divInfo').html("<br/><span><strong> Your turn <i>" + playerName + "</i>! Make a winning move! </strong></span>"); }); //// Fires when move is made by either player. hubConnection.on('moveMade', data => { if (data.Image == playerImage) { //// Move made by player. $("#" + data.ImagePosition).addClass("notAvailable"); $("#" + data.ImagePosition).css('background-image', 'url(' + data.Image + ')'); $('#divInfo').html("<br/><strong>Waiting for <i>" + data.OpponentName + "</i> to make a move. </strong>"); } else { $("#" + data.ImagePosition).addClass("notAvailable"); $("#" + data.ImagePosition).css('background-image', 'url(' + data.Image + ')'); $('#divInfo').html("<br/><strong>Waiting for <i>" + data.OpponentName + "</i> to make a move. </strong>"); } }); //// Fires when the game ends. hubConnection.on('gameOver', data => { $('#divGame').hide(); $('#divInfo').html("<br/><span><strong>Hey " + playerName + "! " + data + " </strong></span>"); $('#divGameBoard').html(" "); $('#divGameInfo').html(" "); $('#divOpponentPlayer').hide(); }); //// Fires when the opponent disconnects. hubConnection.on('opponentDisconnected', data => { $("#divRegister").hide(); $('#divGame').hide(); $('#divGameInfo').html(" "); $('#divInfo').html("<br/><span><strong>Hey " + playerName + "! Your opponent disconnected or left the battle! You are the winner ! Hip Hip Hurray!!!</strong></span>"); }); After every move, both players need to be updated by the server about the move made, so that both players' game boards are in sync. So, on the server side we will need an additional model called MoveInformation, which will contain information on the latest move made by the player and the server will send this model to both the clients to keep them in sync: /// <summary> /// While playing the game, players would make moves. This class contains the information of those moves. /// </summary> internal class MoveInformation { /// <summary> /// Gets or sets the opponent name. /// </summary> public string OpponentName { get; set; } /// <summary> /// Gets or sets the player who made the move. /// </summary> public string MoveMadeBy { get; set; } /// <summary> /// Gets or sets the image position. The position in the game board (0-8) where the player placed his /// image. /// </summary> public int ImagePosition { get; set; } /// <summary> /// Gets or sets the image. The image of the player that he placed in the board (0-8) /// </summary> public string Image { get; set; } } Finally, we will wire up the remaining methods in the GameHub class to complete the game coding. The MakeAMove method is called every time a player makes a move. Also, we have overidden the OnDisconnectedAsync method to inform a player when their opponent disconnects. In this method, we also keep our players and games list current. The comments in the code explain the workings of the methods: /// <summary> /// Invoked by the player to make a move on the board. 
/// </summary> /// <param name="position">The position to place the player</param> public void MakeAMove(int position) { //// Lets find a game from our list of games where one of the player has the same connection Id as the current connection has. var game = games?.FirstOrDefault(x => x.Player1.ConnectionId == Context.ConnectionId || x.Player2.ConnectionId == Context.ConnectionId); if (game == null || game.IsOver) { //// No such game exist! return; } //// Designate 0 for player 1 int symbol = 0; if (game.Player2.ConnectionId == Context.ConnectionId) { //// Designate 1 for player 2. symbol = 1; } var player = symbol == 0 ? game.Player1 : game.Player2; if (player.WaitingForMove) { return; } //// Update both the players that move is made. Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.MoveMade, new MoveInformation { OpponentName = player.Name, ImagePosition = position, Image = player.Image }); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.MoveMade, new MoveInformation { OpponentName = player.Name, ImagePosition = position, Image = player.Image }); //// Place the symbol and look for a winner after every move. if (game.Play(symbol, position)) { Remove<Game>(games, game); Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.GameOver, $"The winner is {player.Name}"); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.GameOver, $"The winner is {player.Name}"); player.IsPlaying = false; player.Opponent.IsPlaying = false; this.Clients.Client(player.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); this.Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); } //// If no one won and its a tame draw, update the players that the game is over and let them look for new game to play. if (game.IsOver && game.IsDraw) { Remove<Game>(games, game); Clients.Client(game.Player1.ConnectionId) .InvokeAsync(Constants.GameOver, "Its a tame draw!!!"); Clients.Client(game.Player2.ConnectionId) .InvokeAsync(Constants.GameOver, "Its a tame draw!!!"); player.IsPlaying = false; player.Opponent.IsPlaying = false; this.Clients.Client(player.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); this.Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.RegistrationComplete); } if (!game.IsOver) { player.WaitingForMove = !player.WaitingForMove; player.Opponent.WaitingForMove = !player.Opponent.WaitingForMove; Clients.Client(player.Opponent.ConnectionId) .InvokeAsync(Constants.WaitingForOpponent, player.Opponent.Name); Clients.Client(player.ConnectionId) .InvokeAsync(Constants.WaitingForOpponent, player.Opponent.Name); } } With this, we are done with the coding of the game and are ready to run the game app. So there you have it! You've just built your first game in .NET Core! The detailed source code can be downloaded from Github. If you're interested in learning more, head on over to get the book, .NET Core 2.0 By Example, by Rishabh Verma and Neha Shrivastava. Applying Single Responsibility principle from SOLID in .NET Core Unit Testing in .NET Core with Visual Studio 2017 for better code quality Get to know ASP.NET Core Web API [Tutorial]


Writing web services with functional Python programming [Tutorial]

Aaron Lazar
12 Aug 2018
18 min read
In this article we'll understand how functional programming can be applied to web services in Python. This article is an extract from the 2nd edition of the bestseller, Functional Python Programming, written by Steven Lott. We'll look at a RESTful web service, which can slice and dice a source of data and provide downloads as JSON, XML, or CSV files. We'll provide an overall WSGI-compatible wrapper. The functions that do the real work of the application won't be narrowly constrained to fit the WSGI standard. We'll use a simple dataset with four subcollections: the Anscombe Quartet. It's a small set of data but it can be used to show the principles of a RESTful web service. We'll split our application into two tiers: a web tier, which will be a simple WSGI application, and data service tier, which will be more typical functional programming. We'll look at the web tier first so that we can focus on a functional approach to provide meaningful results. We need to provide two pieces of information to the web service: The quartet that we want: this is a slice and dice operation. The idea is to slice up the information by filtering and extracting meaningful subsets. The output format we want. The data selection is commonly done through the request path. We can request /anscombe/I/ or /anscombe/II/ to pick specific datasets from the quartet. The idea is that a URL defines a resource, and there's no good reason for the URL to ever change. In this case, the dataset selectors aren't dependent on dates or some organizational approval status, or other external factors. The URL is timeless and absolute. The output format is not a first-class part of the URL. It's just a serialization format, not the data itself. In some cases, the format is requested through the HTTP Accept header. This is hard to use from a browser, but easy to use from an application using a RESTful API. When extracting data from the browser, a query string is commonly used to specify the output format. We'll use the ?form=json method at the end of the path to specify the JSON output format. A URL we can use will look like this: http://localhost:8080/anscombe/III/?form=csv This would request a CSV download of the third dataset. Creating the Web Server Gateway Interface First, we'll use a simple URL pattern-matching expression to define the one and only routing in our application. In a larger or more complex application, we might have more than one such pattern: import re path_pat= re.compile(r"^/anscombe/(?P<dataset>.*?)/?$") This pattern allows us to define an overall script in the WSGI sense at the top level of the path. In this case, the script is anscombe. We'll take the next level of the path as a dataset to select from the Anscombe Quartet. The dataset value should be one of I, II, III, or IV. We used a named parameter for the selection criteria. In many cases, RESTful APIs are described using a syntax, as follows: /anscombe/{dataset}/ We translated this idealized pattern into a proper, regular expression, and preserved the name of the dataset selector in the path. Here are some example URL paths that demonstrate how this pattern works: >>> m1 = path_pat.match( "/anscombe/I" ) >>> m1.groupdict() {'dataset': 'I'} >>> m2 = path_pat.match( "/anscombe/II/" ) >>> m2.groupdict() {'dataset': 'II'} >>> m3 = path_pat.match( "/anscombe/" ) >>> m3.groupdict() {'dataset': ''} Each of these examples shows the details parsed from the URL path. When a specific series is named, this is located in the path. 
When no series is named, then an empty string is found by the pattern. Here's the overall WSGI application: import traceback import urllib.parse def anscombe_app( environ: Dict, start_response: SR_Func ) -> Iterable[bytes]: log = environ['wsgi.errors'] try: match = path_pat.match(environ['PATH_INFO']) set_id = match.group('dataset').upper() query = urllib.parse.parse_qs(environ['QUERY_STRING']) print(environ['PATH_INFO'], environ['QUERY_STRING'], match.groupdict(), file=log) dataset = anscombe_filter(set_id, raw_data()) content_bytes, mime = serialize( query['form'][0], set_id, dataset) headers = [ ('Content-Type', mime), ('Content-Length', str(len(content_bytes))), ] start_response("200 OK", headers) return [content_bytes] except Exception as e: # pylint: disable=broad-except traceback.print_exc(file=log) tb = traceback.format_exc() content = error_page.substitute( title="Error", message=repr(e), traceback=tb) content_bytes = content.encode("utf-8") headers = [ ('Content-Type', "text/html"), ('Content-Length', str(len(content_bytes))), ] start_response("404 NOT FOUND", headers) return [content_bytes] This application will extract two pieces of information from the request: the PATH_INFO and the QUERY_STRING keys in the environment dictionary. The PATH_INFO request will define which set to extract. The QUERY_STRING request will specify an output format. It's important to note that query strings can be quite complex. Rather than assume it is simply a string like ?form=json, we've used the urllib.parse module to properly locate all of the name-value pairs in the query string. The value with the 'form' key in the dictionary extracted from the query string can be found in query['form'][0]. This should be one of the defined formats. If it isn't, an exception will be raised, and an error page displayed. After locating the path and query string, the application processing is highlighted in bold. These two statements rely on three functions to gather, filter, and serialize the results: The raw_data() function reads the raw data from a file. The result is a dictionary with lists of Pair objects. The anscombe_filter() function accepts a selection string and the dictionary of raw data and returns a single list of Pair objects. The list of pairs is then serialized into bytes by the serialize() function. The serializer is expected to produce byte's, which can then be packaged with an appropriate header, and returned. We elected to produce an HTTP Content-Length header as part of the result. This header isn't required, but it's polite for large downloads. Because we decided to emit this header, we are forced to create a bytes object with the serialization of the data so we can count the bytes. If we elected to omit the Content-Length header, we could change the structure of this application dramatically. Each serializer could be changed to a generator function, which would yield bytes as they are produced. For large datasets, this can be a helpful optimization. For the user watching a download, however, it might not be so pleasant because the browser can't display how much of the download is complete. A common optimization is to break the transaction into two parts. The first part computes the result and places a file into a Downloads directory. The response is a 302 FOUND with a Location header that identifies the file to download. Generally, most clients will then request the file based on this initial response. 
The file can be downloaded by Apache httpd or Nginx without involving the Python application. For this example, all errors are treated as a 404 NOT FOUND error. This could be misleading, since a number of individual things might go wrong. More sophisticated error handling could give more try:/except: blocks to provide more informative feedback. For debugging purposes, we've provided a Python stack trace in the resulting web page. Outside the context of debugging, this is a very bad idea. Feedback from an API should be just enough to fix the request, and nothing more. A stack trace provides too much information to potentially malicious users. Getting raw data Here's what we're using for this application: from Chapter_3.ch03_ex5 import ( series, head_map_filter, row_iter) from typing import ( NamedTuple, Callable, List, Tuple, Iterable, Dict, Any) RawPairIter = Iterable[Tuple[float, float]] class Pair(NamedTuple): x: float y: float pairs: Callable[[RawPairIter], List[Pair]] \ = lambda source: list(Pair(*row) for row in source) def raw_data() -> Dict[str, List[Pair]]: with open("Anscombe.txt") as source: data = tuple(head_map_filter(row_iter(source))) mapping = { id_str: pairs(series(id_num, data)) for id_num, id_str in enumerate( ['I', 'II', 'III', 'IV']) } return mapping The raw_data() function opens the local data file, and applies the row_iter() function to return each line of the file parsed into a row of separate items. We applied the head_map_filter() function to remove the heading from the file. The result created a tuple-of-list structure, which is assigned the variable data. This handles parsing the input into a structure that's useful. The resulting structure is an instance of the Pair subclass of the NamedTuple class, with two fields that have float as their type hints. We used a dictionary comprehension to build the mapping from id_str to pairs assembled from the results of the series() function. The series() function extracts (x, y) pairs from the input document. In the document, each series is in two adjacent columns. The series named I is in columns zero and one; the series() function extracts the relevant column pairs. The pairs() function is created as a lambda object because it's a small generator function with a single parameter. This function builds the desired NamedTuple objects from the sequence of anonymous tuples created by the series() function. Since the output from the raw_data() function is a mapping, we can do something like the following example to pick a specific series by name: >>> raw_data()['I'] [Pair(x=10.0, y=8.04), Pair(x=8.0, y=6.95), ... Given a key, for example, 'I', the series is a list of Pair objects that have the x, y values for each item in the series. Applying a filter In this application, we're using a simple filter. The entire filter process is embodied in the following function: def anscombe_filter( set_id: str, raw_data_map: Dict[str, List[Pair]] ) -> List[Pair]: return raw_data_map[set_id] We made this trivial expression into a function for three reasons: The functional notation is slightly more consistent and a bit more flexible than the subscript expression We can easily expand the filtering to do more We can include separate unit tests in the docstring for this function While a simple lambda would work, it wouldn't be quite as convenient to test. For error handling, we've done exactly nothing. We've focused on what's sometimes called the happy path: an ideal sequence of events. 
Any problems that arise in this function will raise an exception. The WSGI wrapper function should catch all exceptions and return an appropriate status message and error response content. For example, it's possible that the set_id method will be wrong in some way. Rather than obsess over all the ways it could be wrong, we'll simply allow Python to throw an exception. Indeed, this function follows the Python advice that, it's better to seek forgiveness than to ask permission. This advice is materialized in code by avoiding permission-seeking: there are no preparatory if statements that seek to qualify the arguments as valid. There is only forgiveness handling: an exception will be raised and handled in the WSGI wrapper. This essential advice applies to the preceding raw data and the serialization that we will see now. Serializing the results Serialization is the conversion of Python data into a stream of bytes, suitable for transmission. Each format is best described by a simple function that serializes just that one format. A top-level generic serializer can then pick from a list of specific serializers. The picking of serializers leads to the following collection of functions: Serializer = Callable[[str, List[Pair]], bytes] SERIALIZERS: Dict[str, Tuple[str, Serializer]]= { 'xml': ('application/xml', serialize_xml), 'html': ('text/html', serialize_html), 'json': ('application/json', serialize_json), 'csv': ('text/csv', serialize_csv), } def serialize( format: str, title: str, data: List[Pair] ) -> Tuple[bytes, str]: mime, function = SERIALIZERS.get( format.lower(), ('text/html', serialize_html)) return function(title, data), mime The overall serialize() function locates a specific serializer in the SERIALIZERS dictionary, which maps a format name to a two-tuple. The tuple has a MIME type that must be used in the response to characterize the results. The tuple also has a function based on the Serializer type hint. This function will transform a name and a list of Pair objects into bytes that will be downloaded. The serialize() function doesn't do any data transformation. It merely maps a name to a function that does the hard work of transformation. Returning a function permits the overall application to manage the details of memory or file-system serialization. Serializing to the file system, while slow, permits larger files to be handled. We'll look at the individual serializers below. The serializers fall into two groups: those that produce strings and those that produce bytes. A serializer that produces a string will need to have the string encoded as bytes for download. A serializer that produces bytes doesn't need any further work. For the serializers, which produce strings, we can use function composition with a standardized convert-to-bytes function. Here's a decorator that can standardize the conversion to bytes: from typing import Callable, TypeVar, Any, cast from functools import wraps def to_bytes( function: Callable[..., str] ) -> Callable[..., bytes]: @wraps(function) def decorated(*args, **kw): text = function(*args, **kw) return text.encode("utf-8") return cast(Callable[..., bytes], decorated) We've created a small decorator named @to_bytes. This will evaluate the given function and then encode the results using UTF-8 to get bytes. Note that the decorator changes the decorated function from having a return type of str to a return type of bytes. We haven't formally declared parameters for the decorated function, and used ... instead of the details. 
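As a quick, purely illustrative check (the greet() function below is hypothetical and not part of the application), applying the decorator to a trivial function shows the str-to-bytes change in action:

@to_bytes
def greet(name: str) -> str:
    return "Hello, " + name

print(greet("world"))  # prints b'Hello, world'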
We'll show how this is used with JSON, CSV, and HTML serializers. The XML serializer produces bytes directly and doesn't need to be composed with this additional function. We could also do the functional composition in the initialization of the serializers mapping. Instead of decorating the function definition, we could decorate the reference to the function object. Here's an alternative definition for the serializer mapping: SERIALIZERS = { 'xml': ('application/xml', serialize_xml), 'html': ('text/html', to_bytes(serialize_html)), 'json': ('application/json', to_bytes(serialize_json)), 'csv': ('text/csv', to_bytes(serialize_csv)), } This replaces decoration at the site of the function definition with decoration when building this mapping data structure. It seems potentially confusing to defer the decoration. Serializing data into JSON or CSV formats The JSON and CSV serializers are similar because both rely on Python's libraries to serialize. The libraries are inherently imperative, so the function bodies are strict sequences of statements. Here's the JSON serializer: import json @to_bytes def serialize_json(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_json( "test", data ) b'[{"x": 2, "y": 3}, {"x": 5, "y": 7}]' """ obj = [dict(x=r.x, y=r.y) for r in data] text = json.dumps(obj, sort_keys=True) return text We created a list-of-dict structure and used the json.dumps() function to create a string representation. The JSON module requires a materialized list object; we can't provide a lazy generator function. The sort_keys=True argument value is helpful for unit testing. However, it's not required for the application and represents a bit of overhead. Here's the CSV serializer: import csv import io @to_bytes def serialize_csv(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_csv("test", data) b'x,y\\r\\n2,3\\r\\n5,7\\r\\n' """ buffer = io.StringIO() wtr = csv.DictWriter(buffer, Pair._fields) wtr.writeheader() wtr.writerows(r._asdict() for r in data) return buffer.getvalue() The CSV module's readers and writers are a mixture of imperative and functional elements. We must create the writer, and properly create headings in a strict sequence. We've used the _fields attribute of the Pair namedtuple to determine the column headings for the writer. The writerows() method of the writer will accept a lazy generator function. In this case, we used the _asdict() method of each Pair object to return a dictionary suitable for use with the CSV writer. Serializing data into XML We'll look at one approach to XML serialization using the built-in libraries. This will build a document from individual tags. A common alternative approach is to use Python introspection to examine and map Python objects and class names to XML tags and attributes. Here's our XML serialization: import xml.etree.ElementTree as XML def serialize_xml(series: str, data: List[Pair]) -> bytes: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_xml( "test", data ) b'<series name="test"><row><x>2</x><y>3</y></row><row><x>5</x><y>7</y></row></series>' """ doc = XML.Element("series", name=series) for row in data: row_xml = XML.SubElement(doc, "row") x = XML.SubElement(row_xml, "x") x.text = str(row.x) y = XML.SubElement(row_xml, "y") y.text = str(row.y) return cast(bytes, XML.tostring(doc, encoding='utf-8')) We created a top-level element, <series>, and placed <row> sub-elements underneath that top element. 
Within each <row> sub-element, we've created <x> and <y> tags, and assigned text content to each tag. The interface for building an XML document using the ElementTree library tends to be heavily imperative. This makes it a poor fit for an otherwise functional design. In addition to the imperative style, note that we haven't created a DTD or XSD. We have not properly assigned a namespace to our tags. We also omitted the <?xml version="1.0"?> processing instruction that is generally the first item in an XML document. The XML.tostring() function has a type hint that states it returns str. This is generally true, but when we provide the encoding parameter, the result type changes to bytes. There's no easy way to formalize the idea of variant return types based on parameter values, so we use an explicit cast() to inform mypy of the actual type. A more sophisticated serialization library could be helpful here. There are many to choose from. Visit https://wiki.python.org/moin/PythonXml for a list of alternatives. Serializing data into HTML In our final example of serialization, we'll look at the complexity of creating an HTML document. The complexity arises because in HTML, we're expected to provide an entire web page with a great deal of context information. Here's one way to tackle this HTML problem: import string data_page = string.Template("""\ <html> <head><title>Series ${title}</title></head> <body> <h1>Series ${title}</h1> <table> <thead><tr><td>x</td><td>y</td></tr></thead> <tbody> ${rows} </tbody> </table> </body> </html> """) @to_bytes def serialize_html(series: str, data: List[Pair]) -> str: """ >>> data = [Pair(2,3), Pair(5,7)] >>> serialize_html("test", data) #doctest: +ELLIPSIS b'<html>...<tr><td>2</td><td>3</td></tr>\\n<tr><td>5</td><td>7</td></tr>... """ text = data_page.substitute( title=series, rows="\n".join( "<tr><td>{0.x}</td><td>{0.y}</td></tr>".format(row) for row in data) ) return text Our serialization function has two parts. The first part is a string.Template() function that contains the essential HTML page. It has two placeholders where data can be inserted into the template. The ${title} method shows where title information can be inserted, and the ${rows} method shows where the data rows can be inserted. The function creates individual data rows using a simple format string. These are joined into a longer string, which is then substituted into the template. While workable for simple cases like the preceding example, this isn't ideal for more complex result sets. There are a number of more sophisticated template tools to create HTML pages. A number of these include the ability to embed the looping in the template, separate from the function that initializes serialization. If you found this tutorial useful and would like to learn more such techniques, head over to get Steven Lott's bestseller, Functional Python Programming. What is the difference between functional and object-oriented programming? Should you move to Python 3? 7 Python experts’ opinions Is Python edging R out in the data science wars?

Implementing RNN in TensorFlow for spam prediction [Tutorial]

Packt Editorial Staff
11 Aug 2018
11 min read
Artificial neural networks (ANN) are an abstract representation of the human nervous system, which contains a collection of neurons that communicate with each other through connections called axons. A recurrent neural network (RNN) is a class of ANN where connections between units form a directed cycle. RNNs make use of information from the past. That way, they can make predictions in data with high temporal dependencies. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior. In this article we will look at: Implementation of basic RNNs in TensorFlow. An example of how to implement an RNN in TensorFlow for spam predictions. Train a model that will learn to distinguish between spam and non-spam emails using the text of the email. This article is an extract taken from the book Deep Learning with TensorFlow – Second Edition, written by Giancarlo Zaccone, Md. Rezaul Karim. Implementing basic RNNs in TensorFlow TensorFlow has tf.contrib.rnn.BasicRNNCell and tf.nn.rnn_cell. BasicRNNCell, which provide the basic building blocks of RNNs. However, first let's implement a very simple RNN model, without using either of these. The idea is to have a better understanding of what goes on under the hood. We will create an RNN composed of a layer of five recurrent neurons using the ReLU activation function. We will assume that the RNN runs over only two-time steps, taking input vectors of size 3 at each time step. The following code builds this RNN, unrolled through two-time steps: n_inputs = 3 n_neurons = 5 X1 = tf.placeholder(tf.float32, [None, n_inputs]) X2 = tf.placeholder(tf.float32, [None, n_inputs]) Wx = tf.get_variable("Wx", shape=[n_inputs,n_neurons], dtype=tf. float32, initializer=None, regularizer=None, trainable=True, collections=None) Wy = tf.get_variable("Wy", shape=[n_neurons,n_neurons], dtype=tf. float32, initializer=None, regularizer=None, trainable=True, collections=None) b = tf.get_variable("b", shape=[1,n_neurons], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) Y1 = tf.nn.relu(tf.matmul(X1, Wx) + b) Y2 = tf.nn.relu(tf.matmul(Y1, Wy) + tf.matmul(X2, Wx) + b) Then we initialize the global variables as follows: init_op = tf.global_variables_initializer() This network looks much like a two-layer feedforward neural network, but both layers share the same weights and bias vectors. Additionally, we feed inputs at each layer and receive outputs from each layer. X1_batch = np.array([[0, 2, 3], [2, 8, 9], [5, 3, 8], [3, 2, 9]]) # t = 0 X2_batch = np.array([[5, 6, 8], [1, 0, 0], [8, 2, 0], [2, 3, 6]]) # t = 1 These mini-batches contain four instances, each with an input sequence composed of exactly two inputs. At the end, Y1_val and Y2_val contain the outputs of the network at both time steps for all neurons and all instances in the mini-batch. Then we create a TensorFlow session and execute the computational graph as follows: with tf.Session() as sess:        init_op.run()        Y1_val, Y2_val = sess.run([Y1, Y2], feed_dict={X1:        X1_batch, X2: X2_batch}) Finally, we print the result: print(Y1_val) # output at t = 0 print(Y2_val) # output at t = 1 The following is the output: >>> [[ 0. 0. 0. 2.56200171 1.20286 ] [ 0. 0. 0. 12.39334488 2.7824254 ] [ 0. 0. 0. 13.58520699 5.16213894] [ 0. 0. 0. 9.95982838 6.20652485]] [[ 0. 0. 0. 14.86255169 6.98305273] [ 0. 0. 26.35326385 0.66462421 18.31009483] [ 5.12617588 4.76199865 20.55905533 11.71787453 18.92538261] [ 0. 0. 
19.75175095 3.38827515 15.98449326]] The network we created is simple, but if you run it over 100 time steps, for example, the graph is going to be very big. Implementing an RNN for spam prediction In this section, we will see how to implement an RNN in TensorFlow to predict spam/ham from texts. Data description and preprocessing The popular spam dataset from the UCI ML repository will be used, which can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip. The dataset contains texts from several emails, some of which were marked as spam. Here we will train a model that will learn to distinguish between spam and non-spam emails using only the text of the email. Let's get started by importing the required libraries and model: import os import re import io import requests import numpy as np import matplotlib.pyplot as plt import tensorflow as tf from zipfile import ZipFile from tensorflow.python.framework import ops import warnings Additionally, we can stop printing the warning produced by TensorFlow if you want: warnings.filterwarnings("ignore") os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' ops.reset_default_graph() Now, let's create the TensorFlow session for the graph: sess = tf.Session() The next task is setting the RNN parameters: epochs = 300 batch_size = 250 max_sequence_length = 25 rnn_size = 10 embedding_size = 50 min_word_frequency = 10 learning_rate = 0.0001 dropout_keep_prob = tf.placeholder(tf.float32) Let's manually download the dataset and store it in a text_data.txt file in the temp directory. First, we set the path: data_dir = 'temp' data_file = 'text_data.txt' if not os.path.exists(data_dir):    os.makedirs(data_dir) Now, we directly download the dataset in zipped format: if not os.path.isfile(os.path.join(data_dir, data_file)):    zip_url = 'http://archive.ics.uci.edu/ml/machine-learning- databases/00228/smsspamcollection.zip'    r = requests.get(zip_url)    z = ZipFile(io.BytesIO(r.content))    file = z.read('SMSSpamCollection') We still need to format the data: text_data = file.decode()    text_data = text_data.encode('ascii',errors='ignore')    text_data = text_data.decode().split('\n') Now, store in it the directory mentioned earlier in a text file: with open(os.path.join(data_dir, data_file), 'w') as file_conn:        for text in text_data:            file_conn.write("{}\n".format(text)) else:    text_data = []    with open(os.path.join(data_dir, data_file), 'r') as file_conn:        for row in file_conn:            text_data.append(row)    text_data = text_data[:-1] Let's split the words that have a word length of at least 2: text_data = [x.split('\t') for x in text_data if len(x)>=1] [text_data_target, text_data_train] = [list(x) for x in zip(*text_data)] Now we create a text cleaning function: def clean_text(text_string):    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)    text_string = " ".join(text_string.split())    text_string = text_string.lower()    return(text_string) We call the preceding method to clean the text: text_data_train = [clean_text(x) for x in text_data_train] Now we need to do one of the most important tasks, which is creating word embedding –changing text into numeric vectors: vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_sequence_length, min_frequency=min_word_frequency) text_processed = np.array(list(vocab_processor.fit_transform(text_data_train))) Now let's shuffle to make the dataset balance: text_processed = np.array(text_processed) text_data_target = 
np.array([1 if x=='ham' else 0 for x in text_data_target]) shuffled_ix = np.random.permutation(np.arange(len(text_data_target))) x_shuffled = text_processed[shuffled_ix] y_shuffled = text_data_target[shuffled_ix] Now that we have shuffled the data, we can split the data into a training and testing set: ix_cutoff = int(len(y_shuffled)*0.75) x_train, x_test = x_shuffled[:ix_cutoff], x_shuffled[ix_cutoff:] y_train, y_test = y_shuffled[:ix_cutoff], y_shuffled[ix_cutoff:] vocab_size = len(vocab_processor.vocabulary_) print("Vocabulary size: {:d}".format(vocab_size)) print("Training set size: {:d}".format(len(y_train))) print("Test set size: {:d}".format(len(y_test))) Following is the output of the preceding code: >>> Vocabulary size: 933 Training set size: 4180 Test set size: 1394 Before we start training, let's create placeholders for our TensorFlow graph: x_data = tf.placeholder(tf.int32, [None, max_sequence_length]) y_output = tf.placeholder(tf.int32, [None]) Let's create the embedding: embedding_mat = tf.get_variable("embedding_mat", shape=[vocab_size, embedding_size], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) embedding_output = tf.nn.embedding_lookup(embedding_mat, x_data) Now it's time to construct our RNN. The following code defines the RNN cell: cell = tf.nn.rnn_cell.BasicRNNCell(num_units = rnn_size) output, state = tf.nn.dynamic_rnn(cell, embedding_output, dtype=tf.float32) output = tf.nn.dropout(output, dropout_keep_prob) Now let's define the way to get the output from our RNN sequence: output = tf.transpose(output, [1, 0, 2]) last = tf.gather(output, int(output.get_shape()[0]) - 1) Next, we define the weights and the biases for the RNN: weight = bias = tf.get_variable("weight", shape=[rnn_size, 2], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) bias = tf.get_variable("bias", shape=[2], dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None) The logits output is then defined. It uses both the weight and the bias from the preceding code: logits_out = tf.nn.softmax(tf.matmul(last, weight) + bias) Now we define the losses for each prediction so that later on, they can contribute to the loss function: losses = tf.nn.sparse_softmax_cross_entropy_with_logits_v2(logits=logits_ou t, labels=y_output) We then define the loss function: loss = tf.reduce_mean(losses) We now define the accuracy of each prediction: accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(logits_out, 1), tf.cast(y_output, tf.int64)), tf.float32)) We then create the training_op with RMSPropOptimizer: optimizer = tf.train.RMSPropOptimizer(learning_rate) train_step = optimizer.minimize(loss) Now let's initialize all the variables using the global_variables_initializer() method: init_op = tf.global_variables_initializer() sess.run(init_op) Additionally, we can create some empty lists to keep track of the training loss, testing loss, training accuracy, and the testing accuracy in each epoch: train_loss = [] test_loss = [] train_accuracy = [] test_accuracy = [] Now we are ready to perform the training, so let's get started. The workflow of the training goes as follows: Shuffle the training data Select the training set and calculate generations Run training step for each batch Run loss and accuracy of training Run the evaluation steps. 
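The training code runs once per epoch, so it sits inside an outer loop over the epochs; assuming the epochs value set earlier, the enclosing loop header would look like the sketch below, with the shuffling, batching, training, and evaluation code that follows forming its body:

for epoch in range(epochs):
    # The shuffle, batch, train, and evaluate steps shown next form the loop body.
    ...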
The following code includes all of the aforementioned steps; note that they run inside a loop over the epochs:

for epoch in range(epochs):
    shuffled_ix = np.random.permutation(np.arange(len(x_train)))
    x_train = x_train[shuffled_ix]
    y_train = y_train[shuffled_ix]
    num_batches = int(len(x_train)/batch_size) + 1
    for i in range(num_batches):
        min_ix = i * batch_size
        max_ix = np.min([len(x_train), ((i+1) * batch_size)])
        x_train_batch = x_train[min_ix:max_ix]
        y_train_batch = y_train[min_ix:max_ix]
        train_dict = {x_data: x_train_batch, y_output: y_train_batch, dropout_keep_prob: 0.5}
        sess.run(train_step, feed_dict=train_dict)
        temp_train_loss, temp_train_acc = sess.run([loss, accuracy], feed_dict=train_dict)
    train_loss.append(temp_train_loss)
    train_accuracy.append(temp_train_acc)
    test_dict = {x_data: x_test, y_output: y_test, dropout_keep_prob: 1.0}
    temp_test_loss, temp_test_acc = sess.run([loss, accuracy], feed_dict=test_dict)
    test_loss.append(temp_test_loss)
    test_accuracy.append(temp_test_acc)
    print('Epoch: {}, Test Loss: {:.2}, Test Acc: {:.2}'.format(epoch+1, temp_test_loss, temp_test_acc))
print('\nOverall accuracy on test set (%): {}'.format(np.mean(temp_test_acc)*100.0))

Following is the output of the preceding code:

>>>
Epoch: 1, Test Loss: 0.68, Test Acc: 0.82
Epoch: 2, Test Loss: 0.68, Test Acc: 0.82
Epoch: 3, Test Loss: 0.67, Test Acc: 0.82
…
Epoch: 997, Test Loss: 0.36, Test Acc: 0.96
Epoch: 998, Test Loss: 0.36, Test Acc: 0.96
Epoch: 999, Test Loss: 0.35, Test Acc: 0.96
Epoch: 1000, Test Loss: 0.35, Test Acc: 0.96

Overall accuracy on test set (%): 96.19799256324768

Well done! The accuracy of the RNN is above 96%, which is outstanding. Now let's observe how the loss propagates across each iteration and over time:

epoch_seq = np.arange(1, epochs+1)
plt.plot(epoch_seq, train_loss, 'k--', label='Train Set')
plt.plot(epoch_seq, test_loss, 'r-', label='Test Set')
plt.title('RNN training/test loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend(loc='upper left')
plt.show()

Figure 1: a) RNN training and test loss per epoch b) test accuracy per epoch

We also plot the accuracy over time:

plt.plot(epoch_seq, train_accuracy, 'k--', label='Train Set')
plt.plot(epoch_seq, test_accuracy, 'r-', label='Test Set')
plt.title('Test accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(loc='upper left')
plt.show()

We discussed the implementation of RNNs in TensorFlow. We saw how to make predictions with data that has a high temporal dependency, and how to develop real-life predictive models that make predictive analytics easier using RNNs. If you want to delve into neural networks and implement deep learning algorithms, check out the book Deep Learning with TensorFlow - Second Edition.

Top 5 Deep Learning Architectures
Understanding Sentiment Analysis and other key NLP concepts
Facelifting NLP with Deep Learning
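As a quick follow-up to the spam example above: if you want to try the trained graph on new messages, a minimal sketch along the following lines should work. It assumes that sess, vocab_processor, clean_text, x_data, dropout_keep_prob, and logits_out from the example are still in scope; the two sample messages are invented purely for illustration:

# Classify a couple of new messages with the graph trained above
new_texts = ["Congratulations! You have won a free prize, call now",
             "Are we still meeting for lunch tomorrow?"]
cleaned = [clean_text(t) for t in new_texts]
new_processed = np.array(list(vocab_processor.transform(cleaned)))
feed = {x_data: new_processed, dropout_keep_prob: 1.0}
scores = sess.run(logits_out, feed_dict=feed)
# Index 1 corresponds to 'ham' and 0 to 'spam' in the label encoding used earlier
predictions = np.argmax(scores, axis=1)
for text, pred in zip(new_texts, predictions):
    print('{} -> {}'.format(text, 'ham' if pred == 1 else 'spam'))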


Four IBM facial recognition patents in 2018, we found intriguing

Natasha Mathur
11 Aug 2018
10 min read
The media has gone into a frenzy over Google’s latest facial recognition patent that shows an algorithm can track you across social media and gather your personal details. We thought, we’d dive further into what other patents Google has applied for in facial recognition tehnology in 2018. What we discovered was an eye opener (pun intended). Google is only the 3rd largest applicant with IBM and Samsung leading the patents race in facial recognition. As of 10th Aug, 2018, 1292 patents have been granted in 2018 on Facial recognition. Of those, IBM received 53. Here is the summary comparison of leading companies in facial recognition patents in 2018. Read Also: Top four Amazon patents in 2018 that use machine learning, AR, and robotics IBM has always been at the forefront of innovation. Let’s go back about a quarter of a century, when IBM invented its first general-purpose computer for business. It built complex software programs that helped in launching Apollo missions, putting the first man on the moon. It’s chess playing computer, Deep Blue, back in 1997,  beat Garry Kasparov, in a traditional chess match (the first time a computer beat a world champion). Its researchers are known for winning Nobel Prizes. Coming back to 2018, IBM unveiled the world’s fastest supercomputer with AI capabilities, and beat the Wall Street expectations by making $20 billion in revenue in Q3 2018 last month, with market capitalization worth $132.14 billion as of August 9, 2018. Its patents are a major part of why it continues to be valuable highly. IBM continues to come up with cutting-edge innovations and to protect these proprietary inventions, it applies for patent grants. United States is the largest consumer market in the world, so patenting the technologies that the companies come out with is a standard way to attain competitive advantage. As per the United States Patent and Trademark Office (USPTO), Patent is an exclusive right to invention and “the right to exclude others from making, using, offering for sale, or selling the invention in the United States or “importing” the invention into the United States”. As always, IBM has applied for patents for a wide spectrum of technologies this year from Artificial Intelligence, Cloud, Blockchain, Cybersecurity, to Quantum Computing. Today we focus on IBM’s patents in facial recognition field in 2018. Four IBM facial recognition innovations patented in 2018 Facial recognition is a technology which identifies and verifies a person from a digital image or a video frame from a video source and IBM seems quite invested in it. Controlling privacy in a face recognition application Date of patent: January 2, 2018 Filed: December 15, 2015 Features: IBM has patented for a face-recognition application titled “Controlling privacy in a face recognition application”. Face recognition technologies can be used on mobile phones and wearable devices which may hamper the user privacy. This happens when a "sensor" mobile user identifies a "target" mobile user without his or her consent. The present mobile device manufacturers don’t provide the privacy mechanisms for addressing this issue. This is the major reason why IBM has patented this technology. Editor’s Note: This looks like an answer to the concerns raised over Google’s recent social media profiling facial recognition patent.   How it works? Controlling privacy in a face recognition application It consists of a privacy control system, which is implemented using a cloud computing node. 
The system uses a camera to find out information about the people, by using a face recognition service deployed in the cloud. As per the patent application “the face recognition service may have access to a face database, privacy database, and a profile database”. Controlling privacy in a face recognition application The facial database consists of one or more facial signatures of one or more users. The privacy database includes privacy preferences of target users. Privacy preferences will be provided by the target user and stored in the privacy database.The profile database contains information about the target user such as name, age, gender, and location. It works by receiving an input which includes a face recognition query and a digital image of a face. The privacy control system then detects a facial signature from the digital image. The target user associated with the facial signature is identified, and profile of the target user is extracted. It then checks the privacy preferences of the user. If there are no privacy preferences set, then it transmits the profile to the sensor user. But, if there are privacy preferences then the censored profile of the user is generated omitting out the private elements in the profile. There are no announcements, as for now, regarding when this technology will hit the market. Evaluating an impact of a user's content utilized in a social network Date of patent: January 30, 2018 Filed: April 11, 2015 Features:  IBM has patented for an application titled “Evaluating an impact of a user's content utilized in a social network”.  With so much data floating around on social network websites, it is quite common for the content of a document (e.g., e-mail message, a post, a word processing document, a presentation) to be reused, without the knowledge of an original author. Evaluating an impact of a user's content utilised in a social network Evaluating an impact of a user's content utilized in a social network Because of this, the original author of the content may not receive any credit, which creates less motivation for the users to post their original content in a social network. This is why IBM has decided to patent for this application. Evaluating an impact of a user's content utilized in a social network As per the patent application, the method/system/product  “comprises detecting content in a document posted on a social network environment being reused by a second user. The method further comprises identifying an author of the content. The method additionally comprises incrementing a first counter keeping track of a number of times the content has been adopted in derivative works”. There’s a processor, which generates an “impact score” which  represents the author's ability to influence other users to adopt the content. This is based on the number of times the content has been adopted in the derivative works. Also, “the method comprises providing social credit to the author of the content using the impact score”. Editor’s Note: This is particularly interesting to us as IBM, unlike other tech giants, doesn’t own a popular social network or media product. (Google has Google+, Microsoft has LinkedIn, Facebook and Twitter are social, even Amazon has stakes in a media entity in the form of Washington Post). No information is present about when or if this system will be used among social network sites. 
Spoof detection for facial recognition Date of patent: February 20, 2018 Filed: December 10, 2015 Features: IBM patented an application named “Spoof detection for facial recognition”.  It provides a method to determine whether the image is authentic or not. As per the patent “A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source.” Editor’s Note: This seems to have a direct impact on the work around tackling deepFakes, which incidentally is something DARPA is very keen on. Could IBM be vying for a long term contract with the government? How it works? The patent consists of a system that helps detect “if a face in a facial recognition authentication system is a three-dimensional structure based on multiple selected images from the input video”.                                      Spoof detection for facial recognition There are four or more two-dimensional feature points which are located via an image processing device connected to the camera. Here the two-dimensional feature points do not lie on the same two-dimensional plane. The patent reads that “one or more additional images of the user's face can be received with the camera; and, the at least four two-dimensional feature points can be located on each additional image with the image processor. The image processor can identify displacements between the two-dimensional feature points on the additional image and the two-dimensional feature points on the first image for each additional image” Spoof detection for facial recognition There is also a processor connected to the image processing device that helps figure out whether the displacements conform to a three-dimensional surface model. The processor can then determine whether to authenticate the user depending on whether the displacements conform to the three-dimensional surface model. Facial feature location using symmetry line Date of patent: June 5, 2018 Filed: July 20, 2015 Features: IBM patented for an application titled “Facial feature location using symmetry line”. As per the patent, “In many image processing applications, identifying facial features of the subject may be desired. Currently, location of facial features require a search in four dimensions using local templates that match the target features. Such a search tends to be complex and prone to errors because it has to locate both (x, y) coordinates, scale parameter and rotation parameter”. Facial feature location using symmetry line Facial feature location using symmetry line The application consists of a computer-implemented method that obtains an image of the subject’s face. After that it automatically detects a symmetry line of the face in the image, where the symmetry line intersects at least a mouth region of the face. It then automatically locates a facial feature of the face using the symmetry line. There’s also a computerised apparatus with a processor which performs the steps of obtaining an image of a subject’s face and helps locate the facial feature.  Editor’s note: Atleast, this patent makes direct sense to us. IBM is majorly focusing on bring AI to healthcare. A patent like this can find a lot of use in not just diagnostics and patient care, but also in cutting edge areas like robotics enabled surgeries. IBM is continually working on new technologies to provide the world with groundbreaking innovations. 
Its big investments in facial recognition technology speak volumes about how well-versed IBM is with its endless possibilities. With progress in facial recognition technology come privacy fears, but IBM's facial recognition application patent has that covered, as it lets users set privacy preferences. This can be a great benchmark for IBM, as not many existing applications currently do this. The social credit evaluation patent can help give a voice back to users who post original content on social media platforms. The spoof detection application will help maintain authenticity by detecting forged images. Lastly, the facial feature detection patent can act as a great additional feature for image processing applications. There is no guarantee from IBM as to whether these patents will ever make it into practical applications, but they do say a lot about how the company thinks about the technology.

Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics
Facebook patents its news feed filter tool to provide more relevant news to its users
Google's new facial recognition patent uses your social network to identify you!


Time series modeling: What is it, Why it matters and How it's used

Sunith Shetty
10 Aug 2018
11 min read
A series can be defined as a number of events, objects, or people of a similar or related kind coming one after another; if we add the dimension of time, we get a time series. A time series can be defined as a series of data points in time order. In this article, we will understand what time series is and why it is one of the essential characteristics for forecasting. This article is an excerpt from a book written by Harish Gulati titled SAS for Finance. The importance of time series What importance, if any, does time series have and how will it be relevant in the future? These are just a couple of fundamental questions that any user should find answers to before delving further into the subject. Let's try to answer this by posing a question. Have you heard the terms big data, artificial intelligence (AI), and machine learning (ML)? These three terms make learning time series analysis relevant. Big data is primarily about a large amount of data that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interaction. AI is a kind of technology that is being developed by data scientists, computational experts, and others to enable processes to become more intelligent, while ML is an enabler that is helping to implement AI. All three of these terms are interlinked with the data they use, and a lot of this data is time series in its nature. This could be either financial transaction data, the behavior pattern of individuals during various parts of the day, or related to life events that we might experience. An effective mechanism that enables us to capture the data, store it, analyze it, and then build algorithms to predict transactions, behavior (and life events, in this instance) will depend on how big data is utilized and how AI and MI are leveraged. A common perception in the industry is that time series data is used for forecasting only. In practice, time series data is used for: Pattern recognition Forecasting Benchmarking Evaluating the influence of a single factor on the time series Quality control For example, a retailer may identify a pattern in clothing sales every time it gets a celebrity endorsement, or an analyst may decide to use car sales volume data from 2012 to 2017 to set a selling benchmark in units. An analyst might also build a model to quantify the effect of Lehman's crash at the height of the 2008 financial crisis in pushing up the price of gold. Variance in the success of treatments across time periods can also be used to highlight a problem, the tracking of which may enable a hospital to take remedial measures. These are just some of the examples that showcase how time series analysis isn't limited to just forecasting. In this chapter, we will review how the financial industry and others use forecasting, discuss what a good and a bad forecast is, and hope to understand the characteristics of time series data and its associated problems. Forecasting across industries Since one of the primary uses of time series data is forecasting, it's wise that we learn about some of its fundamental properties. To understand what the industry means by forecasting and the steps involved, let's visit a common misconception about the financial industry: only lending activities require forecasting. 
We need forecasting in order to grant personal loans, mortgages, overdrafts, or simply assess someone's eligibility for a credit card, as the industry uses forecasting to assess a borrower's affordability and their willingness to repay the debt. Even deposit products such as savings accounts, fixed-term savings, and bonds are priced based on some forecasts. How we forecast and the rationale for that methodology is different in borrowing or lending cases, however. All of these areas are related to time series, as we inevitably end up using time series data as part of the overall analysis that drives financial decisions. Let's understand the forecasts involved here a bit better. When we are assessing an individual's lending needs and limits, we are forecasting for a single person yet comparing the individual to a pool of good and bad customers who have been offered similar products. We are also assessing the individual's financial circumstances and behavior through industry-available scoring models or by assessing their past behavior, with the financial provider assessing the lending criteria. In the case of deposit products, as long as the customer is eligible to transact (can open an account and has passed know your customer (KYC), anti-money laundering (AML), and other checks), financial institutions don't perform forecasting at an individual level. However, the behavior of a particular customer is primarily driven by the interest rate offered by the financial institution. The interest rate, in turn, is driven by the forecasts the financial institution has done to assess its overall treasury position. The treasury is the department that manages the central bank's money and has the responsibility of ensuring that all departments are funded, which is generated through lending and attracting deposits at a lower rate than a bank lends. The treasury forecasts its requirements for lending and deposits, while various teams within the treasury adhere to those limits. Therefore, a pricing manager for a deposit product will price the product in such a way that the product will attract enough deposits to meet the forecasted targets shared by the treasury; the pricing manager also has to ensure that those targets aren't overshot by a significant margin, as the treasury only expects to manage a forecasted target. In both lending and deposit decisions, financial institutions do tend to use forecasting. A lot of these forecasts are interlinked, as we saw in the example of the treasury's expectations and the subsequent pricing decision for a deposit product. To decide on its future lending and borrowing positions, the treasury must have used time series data to determine what the potential business appetite for lending and borrowing in the market is and would have assessed that with the current cash flow situation within the relevant teams and institutions. Characteristics of time series data Any time series analysis has to take into account the following factors: Seasonality Trend Outliers and rare events Disruptions and step changes Seasonality Seasonality is a phenomenon that occurs each calendar year. The same behavior can be observed each year. A good forecasting model will be able to incorporate the effect of seasonality in its forecasts. Christmas is a great example of seasonality, where retailers have come to expect higher sales over the festive period. Seasonality can extend into months but is usually only observed over days or weeks. 
When looking at time series where the periodicity is hours, you may find a seasonality effect for certain hours of the day. Some of the reasons for seasonality include holidays, climate, and changes in social habits. For example, travel companies usually run far fewer services on Christmas Day, citing a lack of demand. During most holidays people love to travel, but this lack of demand on Christmas Day could be attributed to social habits, where people tend to stay at home or have already traveled. Social habit becomes a driving factor in the seasonality of journeys undertaken on Christmas Day. It's easier for the forecaster when a particular seasonal event occurs on a fixed calendar date each year; the issue comes when some popular holidays depend on lunar movements, such as Easter, Diwali, and Eid. These holidays may occur in different weeks or months over the years, which will shift the seasonality effect. Also, if some holidays fall closer to other holiday periods, it may lead to individuals taking extended holidays and travel sales may increase more than expected in such years. The coffee shop near the office may also experience lower sales for a longer period. Changes in the weather can also impact seasonality; for example, a longer, warmer summer may be welcome in the UK, but this would impact retail sales in the autumn as most shoppers wouldn't need to buy a new wardrobe. In hotter countries, sales of air-conditioners would increase substantially compared to the summer months' usual seasonality. Forecasters could offset this unpredictability in seasonality by building in a weather forecast variable. We will explore similar challenges in the chapters ahead. Seasonality shouldn't be confused with a cyclic effect. A cyclic effect is observed over a longer period of generally two years or more. The property sector is often associated with having a cyclic effect, where it has long periods of growth or slowdown before the cycle continues. Trend A trend is merely a long-term direction of observed behavior that is found by plotting data against a time component. A trend may indicate an increase or decrease in behavior. Trends may not even be linear, but a broad movement can be identified by analyzing plotted data. Outliers and rare events Outliers and rare events are terminologies that are often used interchangeably by businesses. These concepts can have a big impact on data, and some sort of outlier treatment is usually applied to data before it is used for modeling. It is almost impossible to predict an outlier or rare event but they do affect a trend. An example of an outlier could be a customer walking into a branch to deposit an amount that is 100 times the daily average of that branch. In this case, the forecaster wouldn't expect that trend to continue. Disruptions Disruptions and step changes are becoming more common in time series data. One reason for this is the abundance of available data and the growing ability to store and analyze it. Disruptions could include instances when a business hasn't been able to trade as normal. Flooding at the local pub may lead to reduced sales for a few days, for example. While analyzing daily sales across a pub chain, an analyst may have to make note of a disruptive event and its impact on the chain's revenue. Step changes are also more common now due to technological shifts, mergers and acquisitions, and business process re-engineering. When two companies announce a merger, they often try to sync their data. 
They might have been selling x and y quantities individually, but after the merger will expect to sell x + y + c (where c is the positive or negative effect of the merger). Over time, when someone plots sales data in this case, they will probably spot a step change in sales that happened around the time of the merger, as shown in the following screenshot: In the trend graph, we can see that online travel bookings are increasing. In the step change and disruptions chart, we can see that Q1 of 2012 saw a substantive increase in bookings, where Q1 of 2014 saw a substantive dip. The increase was due to the merger of two companies that took place in Q1 of 2012. The decrease in Q1 of 2014 was attributed to prolonged snow storms in Europe and the ash cloud disruption from volcanic activity over Iceland. While online bookings kept increasing after the step change, the disruption caused by the snow storm and ash cloud only had an effect on sales in Q1 of 2014. In this case, the modeler will have to treat the merger and the disruption differently while using them in the forecast, as disruption could be disregarded as an outlier and treated accordingly. Also note that the seasonality chart shows that Q4 of each year sees almost a 20% increase in travel bookings, and this pattern continues each calendar year. In this article, we defined time series and learned why it is important for forecasting. We also looked at the characteristics of time series data. To know more how to leverage the analytical power of SAS to perform financial analysis efficiently, you can check out the book SAS for Finance. Read more Getting to know SQL Server options for disaster recovery Implementing a simple Time Series Data Analysis in R Training RNNs for Time Series Forecasting
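As a quick follow-up to the concepts above: trend and seasonality become easy to see once a series is decomposed into components. The following minimal Python sketch (assuming pandas, matplotlib, and statsmodels 0.11 or later are installed) builds a purely synthetic monthly sales series and splits it into trend, seasonal, and residual parts:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly sales with an upward trend and a yearly seasonal pattern
idx = pd.date_range("2012-01-31", periods=72, freq="M")
trend = np.linspace(100, 200, len(idx))
seasonal = 10 * np.sin(2 * np.pi * idx.month / 12)
noise = np.random.normal(0, 3, len(idx))
sales = pd.Series(trend + seasonal + noise, index=idx)

# Split the series into trend, seasonal, and residual components
result = seasonal_decompose(sales, model="additive", period=12)
result.plot()
plt.show()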

How to send email Notifications using SendGrid

Packt Editorial Staff
10 Aug 2018
6 min read
SendGrid is one of the popular services that allow the audience to send emails for different purposes. In today’s tutorial we will explore to: Create SendGrid account Generate SendGrid API Key Configure SendGrid API key with Azure function app Send an email notification to the website administrator Here, we will learn how to create a SendGrid output binding and send an email notification to the administrator with a static content. In general there would be only administrators so we will be hard coding the email address of the administrator in the To address field of the SendGrid output binding Getting ready Create a SendGrid account API Key from the Azure Management Portal. Generate an API Key from the SendGrid Portal. Create a SendGrid account Navigate to Azure Management Portal and create a SendGrid Email Delivery account by searching for the same in the Marketplace shown as follows: In the SendGrid Email Delivery blade, click on Create button to navigate to the Create a new SendGrid Account. Please select Free tier in the Pricing tier and provide all other details and click on the Create button shown as follows: Once the account is created successfully, navigate to the SendGrid account. You can use the search box available in the top which is shown as follows: Navigate to the Settings, choose configurations and grab the username and SmtpServer from the Configurations blade. Generate SendGrid API key In order to utilize SendGrid account by the Azure Functions runtime, we need to provide the SendGrid API key as input to the Azure Functions. You can generate an API Key from the SendGrid portal. Let's navigate to the SendGrid portal by clicking on the Manage button in the Essentials blade of the SendGrid account shown as follows: In the SendGrid portal, click on the API Keys under Settings section of the Left hand side menu shown as follows: In the API Keys page, click on Create API Key shown as follows: In the Create API Key popup, provide a name and choose the API Key Permissions and click on Create & View button. After a moment you will be able to see the API key. Click on the key to copy the same to the clipboard: Configure SendGrid API key with Azure Function app Create a new app setting in the Azure Function app by navigating to the Application Settings blade under the Platform features section of the function app shown as follows: Click on Save button after adding the app settings in the preceding step. How to do it... Navigate to the Integrate tab of the RegisterUser function and click on New Output button to add a new output binding. Choose the SendGrid output binding and click on Select button to add the binding. Please provide the following parameters in the SendGrid output binding: Message parameter name - leave the default value - message. We will be using this parameter in the run method in a moment. SendGrid API key: Please provide the app settings key that you have created in the application settings. To address: Please provide the email address of the administrator. From address: Please provide the email address from where you would like to send the email. In general, it would be kind of [email protected]. Message subject: Please provide the subject that you would like to have in the email subject. Message Text: Please provide the email body text that you would like to have in the email body. Below is how the SendGrid output binding should look like after providing all the fields: Once you review the values, click on Save to save the changes. 
Navigate to the Run method and make the following changes:
Add a new reference for SendGrid and also the SendGrid.Helpers.Mail namespace.
Add a new out parameter message of type Mail.
Create an object of type Mail.

Following is the complete code of the Run method:

#r "Microsoft.WindowsAzure.Storage"
#r "SendGrid"
using System.Net;
using SendGrid.Helpers.Mail;
using Microsoft.WindowsAzure.Storage.Table;
using Newtonsoft.Json;

public static void Run(HttpRequestMessage req,
                       TraceWriter log,
                       CloudTable objUserProfileTable,
                       out string objUserProfileQueueItem,
                       out Mail message)
{
    var inputs = req.Content.ReadAsStringAsync().Result;
    dynamic inputJson = JsonConvert.DeserializeObject<dynamic>(inputs);
    string firstname = inputJson.firstname;
    string lastname = inputJson.lastname;
    string profilePicUrl = inputJson.ProfilePicUrl;

    objUserProfileQueueItem = profilePicUrl;

    UserProfile objUserProfile = new UserProfile(firstname, lastname, profilePicUrl);
    TableOperation objTblOperationInsert = TableOperation.Insert(objUserProfile);
    objUserProfileTable.Execute(objTblOperationInsert);

    message = new Mail();
}

public class UserProfile : TableEntity
{
    public UserProfile(string firstName, string lastName, string profilePicUrl)
    {
        this.PartitionKey = "p1";
        this.RowKey = Guid.NewGuid().ToString();
        this.FirstName = firstName;
        this.LastName = lastName;
        this.ProfilePicUrl = profilePicUrl;
    }

    public UserProfile() { }

    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string ProfilePicUrl { get; set; }
}

Now, let's test the functionality of sending the email by navigating to the RegisterUser function and submitting a request with some test values:

{
    "firstname": "Bill",
    "lastname": "Gates",
    "ProfilePicUrl": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/19/Bill_Gates_June_2015.jpg/220px-Bill_Gates_June_2015.jpg"
}

How it works...
The aim here is to send a notification via email to an administrator, informing them that a new registration was created successfully. We have used one of the Azure Functions experimental templates, named SendGrid, to send the emails, hard coding the following properties in the SendGrid output binding:
From email address
To email address
Subject of the email
Body of the email
The SendGrid output binding uses the API key provided in the app settings to invoke the required APIs of the SendGrid library for sending the emails.
To summarize, we learnt about sending an email notification using the SendGrid service.
This article is an excerpt from the book, Azure Serverless Computing Cookbook, written by Praveen Kumar Sriram. It contains over 50 recipes to help you build applications hosted on Serverless architecture using Azure Functions.
5 reasons why your business should adopt cloud computing
Alibaba Cloud partners with SAP to provide a versatile, one-stop cloud computing environment
Top 10 IT certifications for cloud and networking professionals in 2018
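As a side note, if you want to verify that your SendGrid API key works before wiring it into the function app, a quick sketch with the official sendgrid Python package (version 3 or later) could look like the following; the addresses are placeholders and the key is read from an environment variable:

import os
from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail

# Placeholder addresses purely for illustration
message = Mail(
    from_email="donotreply@example.com",
    to_emails="admin@example.com",
    subject="New user registration",
    plain_text_content="A new user registered successfully.")

# The API key is read from an environment variable rather than hard-coded
sg = SendGridAPIClient(os.environ.get("SENDGRID_API_KEY"))
response = sg.send(message)
print(response.status_code)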


Visualizing data in R and Python using Anaconda [Tutorial]

Natasha Mathur
09 Aug 2018
7 min read
It is said that a picture is worth a thousand words. Through various pictures and graphical presentations, we can express many abstract concepts, theories, data patterns, or certain ideas much clearer. Data can be messy at times, and simply showing the data points would confuse audiences further. If we could have a simple graph to show its main characteristics, properties, or patterns, it would help greatly. In this tutorial, we explain why we should care about data visualization and then we will discuss techniques used for data visualization in R and Python. This article is an excerpt from a book 'Hands-On Data Science with Anaconda' written by Dr. Yuxing Yan, James Yan. Data visualization in R Firstly, let's see the simplest graph for R. With the following one-line R code, we draw a cosine function from -2π to 2π: > plot(cos,-2*pi,2*pi) The related graph is shown here: Histograms could also help us understand the distribution of data points. The previous graph is a simple example of this. First, we generate a set of random numbers drawn from a standard normal distribution. For the purposes of illustration, the first line of set.seed() is actually redundant. Its existence would guarantee that all users would get the same set of random numbers if the same seed was used ( 333 in this case). In other words, with the same set of input values, our histogram would look the same. In the next line, the rnorm(n) function draws n random numbers from a standard normal distribution. The last line then has the hist() function to generate a histogram: > set.seed(333) > data<-rnorm(5000) > hist(data) The associated histogram is shown here: Note that the code of rnorm(5000) is the same as rnorm(5000,mean=0,sd=1), which implies that the default value of the mean is 0 and the default value for sd is 1. The next R program would shade the left-tail for a standard normal distribution: x<-seq(-3,3,length=100) y<-dnorm(x,mean=0,sd=1) title<-"Area under standard normal dist & x less than -2.33" yLabel<-"standard normal distribution" xLabel<-"x value" plot(x,y,type="l",lwd=3,col="black",main=title,xlab=xLabel,ylab=yLabel) x<-seq(-3,-2.33,length=100) y<-dnorm(x,mean=0,sd=1) polygon(c(-4,x,-2.33),c(0,y,0),col="red") The related graph is shown here: Note that according to the last line in the preceding graph, the shaded area is red. In terms of exploring the properties of various datasets, the R package called rattle is quite useful. If the rattle package is not preinstalled, we could run the following code to install it: > install.packages("rattle") Then, we run the following code to launch it; > library(rattle) > rattle() After hitting the Enter key, we can see the following: As our first step, we need to import certain datasets. For the sources of data, we choose from seven potential formats, such as File, ARFF, ODBC, R Dataset, and RData File, and we can load our data from there. The simplest way is using the Library option, which would list all the embedded datasets in the rattle package. After clicking Library, we can see a list of embedded datasets. Assume that we choose acme:boot:Monthly Excess Returns after clicking Execute in the top left. We would then see the following: Now, we can study the properties of the dataset. After clicking Explore, we can use various graphs to view our dataset. Assume that we choose Distribution and select the Benford check box. We can then refer to the following screenshot for more details: After clicking Execute, the following would pop up. 
The top red line shows the frequencies for the Benford Law for each digits of 1 to 9, while the blue line at the bottom shows the properties of our data set. Note that if you don't have the reshape package already installed in your system, then this either won't run or will ask for permission to install the package to your computer: The dramatic difference between those two lines indicates that our data does not follow a distribution suggested by the Benford Law. In our real world, we know that many people, events, and economic activities are interconnected, and it would be a great idea to use various graphs to show such a multi-node, interconnected picture. If the qgraph package is not preinstalled, users have to run the following to install it: > install.packages("qgraph") The next program shows the connection from a to b, a to c, and the like: library(qgraph) stocks<-c("IBM","MSFT","WMT") x<-rep(stocks, each = 3) y<-rep(stocks, 3) correlation<-c(0,10,3,10,0,3,3,3,0) data <- as.matrix(data.frame(from =x, to =y, width =correlation)) qgraph(data, mode = "direct", edge.color = rainbow(9)) If the data is shown, the meaning of the program will be much clearer. The correlation shows how strongly those stocks are connected. Note that all those values are randomly chosen with no real-world meanings: > data from to width [1,] "IBM" "IBM" " 0" [2,] "IBM" "MSFT" "10" [3,] "IBM" "WMT" " 3" [4,] "MSFT" "IBM" "10" [5,] "MSFT" "MSFT" " 0" [6,] "MSFT" "WMT" " 3" [7,] "WMT" "IBM" " 3" [8,] "WMT" "MSFT" " 3" [9,] "WMT" "WMT" " 0" A high value for the third variable suggests a stronger correlation. For example, IBM is more strongly correlated with MSFT, with a value of 10, than its correlation with WMT, with a value of 3. The following graph shows how strongly those three stocks are correlated: The following program shows the relationship or interconnection between five factors: library(qgraph) data(big5) data(big5groups) title("Correlations among 5 factors",line = 2.5) qgraph(cor(big5),minimum = 0.25,cut = 0.4,vsize = 1.5, groups = big5groups,legend = TRUE, borders = FALSE,theme = 'gray') The related graph is shown here: Data visualization in Python The most widely used Python package for graphs and images is called matplotlib. The following program can be viewed as the simplest Python program to generate a graph since it has just three lines: import matplotlib.pyplot as plt plt.plot([2,3,8,12]) plt.show() The first command line would upload a Python package called matplotlib.pyplot and rename it to plt. Note that we could even use other short names, but it is conventional to use plt for the matplotlib package. The second line plots four points, while the last one concludes the whole process. The completed graph is shown here: For the next example, we add labels for both x and y, and a title. The function is the cosine function with an input value varying from -2π to 2π: import scipy as sp import matplotlib.pyplot as plt x=sp.linspace(-2*sp.pi,2*sp.pi,200,endpoint=True) y=sp.cos(x) plt.plot(x,y) plt.xlabel("x-value") plt.ylabel("Cosine function") plt.title("Cosine curve from -2pi to 2pi") plt.show() The nice-looking cosine graph is shown here: If we received $100 today, it would be more valuable than what would be received in two years. This concept is called the time value of money, since we could deposit $100 today in a bank to earn interest. 
The following Python program uses size to illustrate this concept: import matplotlib.pyplot as plt fig = plt.figure(facecolor='white') dd = plt.axes(frameon=False) dd.set_frame_on(False) dd.get_xaxis().tick_bottom() dd.axes.get_yaxis().set_visible(False) x=range(0,11,2) x1=range(len(x),0,-1) y = [0]*len(x); plt.annotate("$100 received today",xy=(0,0),xytext=(2,0.15),arrowprops=dict(facecolor='black',shrink=2)) plt.annotate("$100 received in 2 years",xy=(2,0),xytext=(3.5,0.10),arrowprops=dict(facecolor='black',shrink=2)) s = [50*2.5**n for n in x1]; plt.title("Time value of money ") plt.xlabel("Time (number of years)") plt.scatter(x,y,s=s); plt.show() The associated graph is shown here. Again, the different sizes show their present values in relative terms: To summarize, we discussed ways data visualization works in Python and R.  Visual presentations can help our audience understand data better. If you found this post useful, check out the book 'Hands-On Data Science with Anaconda' to learn about different types of visual representation written in languages such as R, Python, Julia, etc. A tale of two tools: Tableau and Power BI Anaconda Enterprise version 5.1.1 released! 10 reasons why data scientists love Jupyter notebooks
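As a quick follow-up: the R example earlier shaded the left tail of a standard normal distribution; a rough matplotlib and SciPy equivalent might look like the following sketch:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.linspace(-3, 3, 200)
y = norm.pdf(x, loc=0, scale=1)

plt.plot(x, y, color="black", lw=3)
# Shade the area under the curve where x is less than -2.33
x_fill = np.linspace(-3, -2.33, 100)
plt.fill_between(x_fill, norm.pdf(x_fill), color="red")
plt.title("Area under standard normal dist & x less than -2.33")
plt.xlabel("x value")
plt.ylabel("standard normal distribution")
plt.show()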