Configuring and deploying HBase [Tutorial]

Natasha Mathur
02 Jul 2018
21 min read
HBase is inspired by Google's Bigtable architecture and is fundamentally a non-relational, open source, column-oriented, distributed NoSQL database. Written in Java, it is designed and developed by many engineers under the umbrella of the Apache Software Foundation. Architecturally, it sits on top of Apache Hadoop and uses the Hadoop Distributed File System (HDFS) as its foundation. It is a column-oriented database empowered by a fault-tolerant distributed file system, HDFS. In addition, it provides advanced features such as auto-sharding, load balancing, in-memory caching, replication, compression, near real-time lookups, and strong consistency (using multi-versioning). It uses block caches and Bloom filters to respond faster to online/real-time requests, and it supports multiple clients running on heterogeneous platforms by providing user-friendly APIs.

In this tutorial, we will discuss how to effectively set up a mid- to large-size HBase cluster on top of the Hadoop/HDFS framework, and we will help you set up HBase on a fully distributed cluster. For the cluster setup, we will use RHEL (Red Hat Enterprise Linux 6.2, 64-bit) and six nodes. This article is an excerpt taken from the book 'HBase High Performance Cookbook' written by Ruchir Choudhry. The book provides a solid understanding of the HBase basics. Let's get started!

Configuring and deploying HBase

Before we start HBase in fully distributed mode, we will first set up Hadoop 2.2.0 in distributed mode, and then set up HBase on top of the Hadoop cluster, because HBase stores its data in HDFS.

Getting ready

The first step is to create a directory at /u/HBaseB and download the tar file from the location given below. The location can be local, a mount point, or, in cloud environments, block storage:

wget -b http://apache.mirrors.pair.com/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

The -b option downloads the tar file as a background process. The output is piped to wget-log, which you can follow with tail -200f wget-log. Untar the archive using the following command:

tar -xzvf hadoop-2.2.0.tar.gz

This untars the file into a hadoop-2.2.0 folder in your current directory. Once the untar process is done, for clarity it is recommended to use two different folders: one for the NameNode and the other for the DataNode.

I am assuming app is a user and app is a group on the Linux platform with read/write/execute access to the relevant locations. If not, create the app user and app group (you will need sudo su - or root/admin access); otherwise, ask your administrator to create this user and group on all the nodes and directories you will be accessing.

To keep the NameNode data and the DataNode data clearly separated, create two folders inside /u/HBaseB using the following command:

mkdir NameNodeData DataNodeData

NameNodeData will hold the data used by the name nodes and DataNodeData will hold the data used by the data nodes. Running ls -ltr will show the following results.
drwxrwxr-x 2 app app  4096 Jun 19 22:22 NameNodeData
drwxrwxr-x 2 app app  4096 Jun 19 22:22 DataNodeData

-bash-4.1$ pwd
/u/HBaseB/hadoop-2.2.0
-bash-4.1$ ls -ltr
total 60K
drwxr-xr-x 2 app app 4.0K Mar 31 08:49 bin
drwxrwxr-x 2 app app 4.0K Jun 19 22:22 DataNodeData
drwxr-xr-x 3 app app 4.0K Mar 31 08:49 etc

The steps in planning a Hadoop cluster are:

- Hardware details required for the setup
- Software required for the setup
- OS required for the setup
- Configuration steps

The HDFS core architecture is based on a master/slave design, where an HDFS cluster comprises a solo NameNode, which is essentially used as the master node and owns the responsibility for orchestrating and handling the file system namespace and controlling access to files by clients. It performs this task by storing all the modifications to the underlying file system and propagating these changes as logs, appended to the native file system files and edits. The SecondaryNameNode is designed to merge the fsimage and the edits log files regularly and keeps the size of the edit logs within an acceptable limit. In a true cluster/distributed environment, it runs on a different machine; it works as a checkpoint in HDFS.

We will require the following for the NameNode:

Components | Details | Used for nodes/systems
Operating system | Red Hat 6.2 Linux x86_64 GNU/Linux, or another standard Linux kernel | All the setup for Hadoop/HBase and the other components used
Hardware/CPUs | 16 to 32 CPU cores | NameNode/Secondary NameNode
Hardware/CPUs | 2 quad-/hex-/octo-core CPUs | DataNodes
Hardware/RAM | 128 to 256 GB (in special cases 128 GB to 512 GB) | NameNode/Secondary NameNode
Hardware/RAM | 128 GB to 512 GB of RAM | DataNodes
Hardware/storage | It's pivotal to have the NameNode server on a robust and reliable storage platform, as it is responsible for key activities such as edit-log journaling. Because these machines are so important and the NameNode plays a central role in orchestrating everything, RAID or any robust storage device is acceptable. | NameNode/Secondary NameNode
Hardware/storage | 2 to 4 TB hard disks in a JBOD | DataNodes

RAID stands for redundant array of independent (or inexpensive) disks. There are many RAID levels, but for a master or NameNode, RAID 1 will be enough. JBOD stands for Just a Bunch Of Disks: multiple hard drives stacked together with no redundancy, where the calling software needs to take care of failure and redundancy. In essence, it works as a single logical volume.

Before we start the cluster setup, a quick recap of the Hadoop setup is essential, with brief descriptions.

How to do it

Let's create a directory where you will have all the software components to be downloaded; for simplicity, let's take it as /u/HBaseB.

Create different users for different purposes. The format will be user/group; this is essentially required to differentiate the roles used for specific purposes:

- hdfs/hadoop is for handling the Hadoop-related setup
- yarn/hadoop is for the YARN-related setup
- hbase/hadoop
- pig/hadoop
- hive/hadoop
- zookeeper/hadoop
- hcat/hadoop

Set up directories for the Hadoop cluster. Let's assume /u is a shared mount point. We can create specific directories that will be used for specific purposes. Please make sure that you have adequate privileges on the folders to add, edit, and execute commands. Also, you must set up passwordless communication between the different machines, for example from the name node to the data nodes and from the HBase master to all the region server nodes.
Once the structure mentioned earlier is created, we can download the tar files. A listing of the downloaded and extracted components looks like this:

-bash-4.1$ ls -ltr
total 32
drwxr-xr-x  9 app app 4096 hadoop-2.2.0
drwxr-xr-x 10 app app 4096 zookeeper-3.4.6
drwxr-xr-x 15 app app 4096 pig-0.12.1
drwxrwxr-x  7 app app 4096 hbase-0.98.3-hadoop2
drwxrwxr-x  8 app app 4096 apache-hive-0.13.1-bin
drwxrwxr-x  7 app app 4096 Jun 30 01:04 mahout-distribution-0.9

You can download the tar files from the following locations:

wget -o https://archive.apache.org/dist/hbase/hbase-0.98.3/hbase-0.98.3-hadoop1-bin.tar.gz
wget -o https://www.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
wget -o https://archive.apache.org/dist/mahout/0.9/mahout-distribution-0.9.tar.gz
wget -o https://archive.apache.org/dist/hive/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz
wget -o https://archive.apache.org/dist/pig/pig-0.12.1/pig-0.12.1.tar.gz

Let's assume there is a /u directory and you have downloaded the entire stack of software into it. Go to /u/HBaseB/hadoop-2.2.0/etc/hadoop/ and look for the file core-site.xml. Place the following lines in this configuration file:

<configuration>
<property>
   <name>fs.default.name</name>
   <value>hdfs://addressofbsdnsofmynamenode-hadoop:9001</value>
</property>
</configuration>

You can specify any port you want to use, as long as it does not clash with ports already in use by the system for other purposes. Save the file. This helps us create the master/NameNode.

Now, let's move on to setting up the SecondaryNameNode. Edit /u/HBaseB/hadoop-2.2.0/etc/hadoop/core-site.xml again and add the following:

<property>
 <name>fs.defaultFS</name>
 <value>hdfs://custom location of your hdfs</value>
</property>
<configuration>
<property>
          <name>fs.checkpoint.dir</name>
          <value>/u/HBaseB/dn001/hadoop/hdfs/secdn,/u/HBaseB/dn002/hadoop/hdfs/secdn</value>
   </property>
</configuration>

The separation of the directory structure is for the purpose of a clean separation of the HDFS blocks and to keep the configuration as simple as possible. This also allows us to do proper maintenance. Now, let's move towards changing the setup for HDFS; the file location will be /u/HBaseB/hadoop-2.2.0/etc/hadoop/hdfs-site.xml.
Add these properties to hdfs-site.xml.

For the NameNode:

<property>
         <name>dfs.name.dir</name>
         <value>/u/HBaseB/nn01/hadoop/hdfs/nn,/u/HBaseB/nn02/hadoop/hdfs/nn</value>
     </property>

For the DataNode:

<property>
         <name>dfs.data.dir</name>
         <value>/u/HBaseB/dnn01/hadoop/hdfs/dn,/u/HBaseB/dnn02/hadoop/hdfs/dn</value>
</property>

Now, let's configure the NameNode HTTP address, so that it can be accessed using the HTTP protocol:

<property>
<name>dfs.http.address</name>
<value>yournamenode.full.hostname:50070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>secondary.yournamenode.full.hostname:50090</value>
     </property>

We could also set up HTTPS for the NameNode, but let's keep that optional for now.

Let's set up the YARN resource manager. The YARN configuration lives in /u/HBaseB/hadoop-2.2.0/etc/hadoop/yarn-site.xml.

For the resource tracker, part of the YARN resource manager:

<property>
 <name>yarn.yourresourcemanager.resourcetracker.address</name>
<value>youryarnresourcemanager.full.hostname:8025</value>
</property>

For the resource scheduler, part of the YARN resource manager:

<property>
<name>yarn.yourresourcemanager.scheduler.address</name>
<value>yourresourcemanager.full.hostname:8030</value>
</property>

For the resource manager address:

<property>
<name>yarn.yourresourcemanager.address</name>
<value>yourresourcemanager.full.hostname:8050</value>
</property>

For the resource manager admin address:

<property>
<name>yarn.yourresourcemanager.admin.address</name>
<value>yourresourcemanager.full.hostname:8041</value>
</property>

To set up the local dirs:

<property>
        <name>yarn.yournodemanager.local-dirs</name>
        <value>/u/HBaseB/dnn01/hadoop/hdfs/yarn,/u/HBaseB/dnn02/hadoop/hdfs/yarn</value>
   </property>

To set up the log location:

<property>
<name>yarn.yournodemanager.logdirs</name>
         <value>/u/HBaseB/var/log/hadoop/yarn</value>
</property>

This completes the configuration changes required for YARN. Now, let's make the changes for MapReduce. Open mapred-site.xml at /u/HBaseB/hadoop-2.2.0/etc/hadoop/mapred-site.xml and place the following property between the <configuration> and </configuration> tags:

<property>
<name>mapreduce.yourjobhistory.address</name>
<value>yourjobhistoryserver.full.hostname:10020</value>
</property>

Once we have configured the MapReduce job history details, we can move on to configuring HBase. Go to /u/HBaseB/hbase-0.98.3-hadoop2/conf and open hbase-site.xml. You will see a template containing the following:

<configuration>
</configuration>

We need to add the following lines between the starting and ending tags:

<property>
<name>hbase.rootdir</name>
<value>hdfs://hbase.yournamenode.full.hostname:8020/apps/hbase/data</value>
</property>
<property>
<name>hbase.yourmaster.info.bindAddress</name>
<value>$hbase.yourmaster.full.hostname</value>
</property>

This completes the HBase changes.

ZooKeeper: Now, let's focus on the setup of ZooKeeper. In a distributed environment, go to /u/HBaseB/zookeeper-3.4.6/conf and rename zoo_sample.cfg to zoo.cfg. Open zoo.cfg with vi zoo.cfg and place the details as follows; this will create two instances of ZooKeeper on different ports:

yourzooKeeperserver.1=zoo1:2888:3888
yourzooKeeperserver.2=zoo2:2888:3888

If you want to test this setup locally, please use different port combinations.
In a production-like setup, as mentioned earlier, each line follows the pattern server.id=host:port:port; so in yourzooKeeperserver.1=zoo1:2888:3888, yourzooKeeperserver.1 is the server.id, zoo1 is the host, and 2888 and 3888 are the ports.

Atomic broadcasting is an atomic messaging system that keeps all the servers in sync and provides reliable delivery, total order, causal order, and so on.

Region servers: Before concluding, let's go through the region server setup process. Go to the folder /u/HBaseB/hbase-0.98.3-hadoop2/conf and edit the regionservers file, specifying the region servers accordingly:

RegionServer1
RegionServer2
RegionServer3
RegionServer4

RegionServer1 is the IP or fully qualified CNAME of the first region server. You can have as many region servers as you like (N=4 in our case), but each one's CNAME and mapping in the regionservers file needs to be different.

Copy all the configuration files of HBase and ZooKeeper to the respective hosts dedicated to HBase and ZooKeeper. As the setup is a fully distributed cluster, we will be using different hosts for HBase and its components and a dedicated host for ZooKeeper.

Next, we validate the setup by adding the required environment variables (such as HADOOP_CONF_DIR, HDFS_USER, YARN_USER, MAPRED_USER, and HBASE_CONF_DIR, which are used by the commands that follow) to .bashrc; this makes sure we are later able to configure the NameNode as expected. It is preferable to put them in your profile, essentially /etc/profile, so that only the shell being used is affected.

Now let's format the NameNode:

sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/bin/hadoop namenode -format

HDFS is implemented on top of the existing local file system of your cluster. When you start the Hadoop setup for the first time, you need to start with a clean slate, and hence any existing data needs to be formatted and erased. Before formatting, check whether there is another Hadoop cluster running and using the same HDFS; if this is done accidentally, all the data will be lost.

/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode

Now let's start the SecondaryNameNode:

sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start secondarynamenode

Repeat the same procedure on the DataNodes:

sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode

Test 01: See if you can reach http://namenode.full.hostname:50070 from your browser.

Test 02:

sudo su $HDFS_USER
touch /tmp/hello.txt

Now the hello.txt file is created in the /tmp location.

/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -mkdir -p /app
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -mkdir -p /app/apphduser

This creates a specific directory for this application user in the HDFS file system (/app/apphduser).

/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -copyFromLocal /tmp/hello.txt /app/apphduser
/u/HBaseB/hadoop-2.2.0/bin/hadoop dfs -ls /app/apphduser

apphduser is a directory created in HDFS for a specific user, so that data is separated by user; in a true production environment many users will be using the cluster. You can also use hdfs dfs -ls / if the hadoop command is reported as deprecated. You must see hello.txt once the command executes.

Test 03: Browse http://datanode.full.hostname:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=$datanode.full.hostname:8020 (change the data node hostname and the other parameters accordingly). You should see the directory details on the DataNode.
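If you prefer to script these checks rather than click through the web UI, the same verification can be done programmatically. The sketch below is an illustrative example, not part of the original recipe: it assumes WebHDFS is enabled on the NameNode (dfs.webhdfs.enabled=true) and uses the third-party hdfs Python package; the hostname, port, and user are placeholders to adjust for your cluster.

```python
# Hypothetical verification script; assumes WebHDFS is enabled and the
# `hdfs` Python package (pip install hdfs) is installed.
from hdfs import InsecureClient

# Placeholder NameNode web UI address and user; adjust for your cluster.
client = InsecureClient('http://namenode.full.hostname:50070', user='app')

# Upload the test file created earlier and list the target directory.
client.makedirs('/app/apphduser')
client.upload('/app/apphduser/hello.txt', '/tmp/hello.txt', overwrite=True)

for name in client.list('/app/apphduser'):
    print(name)   # expect to see hello.txt
```

If the listing prints hello.txt, HDFS is accepting writes and serving metadata correctly, which is the same thing the browser tests above confirm.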
Validate the YARN/MapReduce setup. Execute this command from the resource manager:

<login as $YARN_USER>
/u/HBaseB/hadoop-2.2.0/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

Execute the following command from the NodeManager:

<login as $YARN_USER>
/u/HBaseB/hadoop-2.2.0/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

Executing the following commands will create the directories in HDFS and apply the respective access rights:

cd /u/HBaseB/hadoop-2.2.0/bin
hadoop fs -mkdir /app-logs              # creates the dir in HDFS
hadoop fs -chown $YARN_USER /app-logs   # changes the ownership
hadoop fs -chmod 1777 /app-logs         # explained in the note section

Execute MapReduce. Start the job history server:

<login as $MAPRED_USER>
/u/HBaseB/hadoop-2.2.0/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

Let's run a few tests to be sure we have configured everything properly.

Test 01: From the browser, or with curl, open http://yourresourcemanager.full.hostname:8088/.

Test 02:

sudo su $HDFS_USER
/u/HBaseB/hadoop-2.2.0/bin/hadoop jar /u/HBaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar teragen 100 /test/10gsort/input
/u/HBaseB/hadoop-2.2.0/bin/hadoop jar /u/HBaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar

Validate the HBase setup. Login as $HDFS_USER:

/u/HBaseB/hadoop-2.2.0/bin/hadoop fs -mkdir -p /apps/hbase
/u/HBaseB/hadoop-2.2.0/bin/hadoop fs -chown app:app -R /apps/hbase

Now login as $HBASE_USER:

/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start master

This command starts the master node. Now let's move to the HBase region server nodes:

/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start regionserver

This command starts the region servers. For a single machine, sudo ./hbase master start can also be used directly. Please check the logs at /opt/HBaseB/hbase-0.98.5-hadoop2/logs in case of any errors.

Now let's log in using:

sudo su - $HBASE_USER
/u/HBaseB/hbase-0.98.3-hadoop2/bin/hbase shell

This connects us to the HBase master.
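Beyond the interactive shell, a quick client-side smoke test can also be run from Python. This is not part of the original recipe: the sketch below assumes you have started the HBase Thrift server (for example with hbase-daemon.sh start thrift) and installed the third-party happybase package; the hostname and table name are placeholders.

```python
# Hypothetical smoke test; assumes the HBase Thrift server is running on the
# master host and `happybase` (pip install happybase) is installed.
import happybase

connection = happybase.Connection('hbase.yournamenode.full.hostname')  # placeholder host

# Create a small test table with one column family, write a row, and read it back.
if b'smoke_test' not in connection.tables():
    connection.create_table('smoke_test', {'cf': dict()})

table = connection.table('smoke_test')
table.put(b'row1', {b'cf:greeting': b'hello hbase'})

for key, data in table.scan():
    print(key, data)   # expect: b'row1' {b'cf:greeting': b'hello hbase'}

connection.close()
```

A successful put and scan confirms that the master, at least one region server, and the underlying HDFS paths are all wired together correctly.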
Validate the ZooKeeper setup. If you want to use an external ZooKeeper ensemble, make sure no HBase-managed internal ZooKeeper is running while you work with the external (or existing) ZooKeeper that is not managed by HBase. For this, edit /opt/HBaseB/hbase-0.98.5-hadoop2/conf/hbase-env.sh and change the following statement to HBASE_MANAGES_ZK=false:

# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=true

Once this is done, we can add zoo.cfg to HBase's CLASSPATH; HBase looks to zoo.cfg as a default lookup for configuration:

dataDir=/opt/HBaseB/zookeeper-3.4.6/zooData   # this is where the zooData will be kept
server.1=172.28.182.45:2888:3888              # IP and ports for server 01
server.2=172.29.75.37:4888:5888               # IP and ports for server 02

You can edit the log4j.properties file, located at /opt/HBaseB/zookeeper-3.4.6/conf, to point to the location where you want to keep the logs:

# Define some default values that can be overridden by system properties:
zookeeper.root.logger=INFO, CONSOLE
zookeeper.console.threshold=INFO
zookeeper.log.dir=.
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=.
# you can specify the location here
zookeeper.tracelog.file=zookeeper_trace.log

Once this is done, start ZooKeeper with the following command:

-bash-4.1$ sudo /u/HBaseB/zookeeper-3.4.6/bin/zkServer.sh start
Starting zookeeper ... STARTED

You can also pipe the log to the ZooKeeper logs:

/u/logs//u/HBaseB/zookeeper-3.4.6/zoo.out 2>&1

Here, 2 refers to the second file descriptor of the process, that is, stderr; > means redirect; and &1 means the target of the redirection should be the same location as the first file descriptor, that is, stdout.

How it works

Sizing the environment is critical for the success of any project, and it's a complex task to optimize it to your needs. We dissect it into two parts, the master and slave setup, which can be divided into the following roles:

- Master: NameNode
- Master: Secondary NameNode
- Master: JobTracker
- Master: YARN Resource Manager
- Master: HBase Master
- Slave: DataNode
- Slave: MapReduce TaskTracker
- Slave: YARN Node Manager
- Slave: HBase region server

NameNode: The architecture of Hadoop gives us the capability to set up a fully fault-tolerant, highly available Hadoop/HBase cluster. In doing so, it requires a master and slave setup. In a fully HA setup, nodes are configured in an active-passive way: one node is always active at any given point of time and the other node remains passive. The active node is the one interacting with the clients and works as a coordinator for them. The standby node keeps itself synchronized with the active node to keep the state intact and live, so that in case of failover it is ready to take the load without any downtime. We have to make sure that when the passive node takes over in the event of a failure, it is in perfect sync with the active node that was taking the traffic. This is done by JournalNodes (JNs), which use daemon threads to keep the primary and secondary in perfect sync.

JournalNode: By design, the JournalNodes allow only a single NameNode, acting as active/primary, to be a writer at a time. In case of failure of the active/primary, the passive NameNode immediately takes charge and transforms itself into the active node, which means the newly active node starts writing to the JournalNodes. This prevents the other NameNode from staying in the active state and confirms that the newly active node works as the failover node.

JobTracker: This is an integral part of the Hadoop ecosystem. It works as a service which farms MapReduce tasks out to specific nodes in the cluster.

ResourceManager (RM): Its responsibility is limited to scheduling, that is, mediating the available resources in the system between the different needs of applications, registering new nodes, and retiring dead nodes; it does this by constantly monitoring heartbeats based on its internal configuration. Due to this core design practice of explicit separation of responsibilities, clear modularity, and a built-in robust scheduler API, the resource manager can scale and support different design needs on one end while catering to different programming models on the other.

HBase Master: The Master server is the main orchestrator for all the region servers in the HBase cluster. Usually, it's placed on the ZooKeeper nodes. In a real cluster configuration, you will have five to six ZooKeeper nodes.

DataNode: It's the real workhorse and does most of the heavy lifting; it runs the MapReduce jobs and stores the chunks of HDFS data.
The core objective of the DataNode is to be available on commodity hardware and be agnostic to failures. It keeps some of the HDFS data, and multiple copies of the same data are sprinkled around the cluster; this makes the DataNode architecture fully fault tolerant. This is the reason a DataNode can use JBOD rather than relying on expensive RAID.

MapReduce: Jobs are run on these DataNodes in parallel as subtasks, and the subtasks keep the data consistent across the cluster.

We learned about the HBase basics and how to configure and set it up. We set up HBase to store data in the Hadoop Distributed File System, and we also explored the working structure of RAID and JBOD and the differences between the two. If you found this post useful, be sure to check out the book 'HBase High Performance Cookbook' to learn more about configuring HBase, administering and managing clusters, and other HBase concepts.

Read more:
- Understanding the HBase Ecosystem
- Configuring HBase
- 5 Mistakes Developers Make When Working with HBase
Administration rights for Power BI users

Pravin Dhandre
02 Jul 2018
8 min read
In this tutorial, you will understand and learn administration rights and rules for Power BI users. This includes setting and monitoring rules such as who in the organization can utilize which feature, how Power BI Premium capacity is allocated and by whom, and other settings such as embed codes and custom visuals. This article is an excerpt from a book written by Brett Powell titled Mastering Microsoft Power BI.

The admin portal is accessible to Office 365 Global Administrators and to users mapped to the Power BI service administrator role. To open the admin portal, log in to the Power BI service and select the Admin portal item from the Settings (gear icon) menu in the top right. All Power BI users, including Power BI free users, are able to access the Admin portal; however, users who are not admins can only view the Capacity settings page. The Power BI service administrators and Office 365 global administrators have view and edit access to all seven pages of the portal.

Administrators of Power BI most commonly utilize the Tenant settings and Capacity settings, as described in the Tenant settings and Power BI Premium capacities sections later in this tutorial. However, the admin portal can also be used to manage any approved custom visuals for the organization.

Usage metrics

The Usage metrics page of the Admin portal provides admins with a Power BI dashboard of several top metrics, such as the most consumed dashboards and the most consumed dashboards by workspace. However, the dashboard cannot be modified, and the tiles of the dashboard are not linked to any underlying reports or separate dashboards to support further analysis. Given these limitations, alternative monitoring solutions are recommended, such as the Office 365 audit logs and the usage metric datasets specific to Power BI apps. Details of both monitoring options are included in the app usage metrics and Power BI audit log activities sections later in this chapter.

Users and Audit logs

The Users and Audit logs pages only provide links to the Office 365 admin center. In the admin center, Power BI users can be added, removed, and managed. If audit logging is enabled for the organization via the Create audit logs for internal activity and auditing and compliance tenant setting, this audit log data can be retrieved from the Office 365 Security & Compliance Center or via PowerShell. This setting is noted in the following section regarding the Tenant settings tab of the Power BI admin portal. An Office 365 license is not required to utilize the Office 365 admin center for Power BI license assignments or to retrieve Power BI audit log activity.

Tenant settings

The Tenant settings page of the Admin portal allows administrators to enable or disable various features of the Power BI web service. Likewise, the administrator can allow only a certain security group to embed Power BI content in SaaS applications such as SharePoint Online. There are currently 18 tenant settings available in the admin portal, each with a defined scope that determines how administrators can configure it. From a data security perspective, the first seven settings within the Export and sharing and Content packs and apps groups are the most important. For example, many organizations choose to disable the Publish to web feature for the entire organization. Additionally, only certain security groups may be allowed to export data or to print hard copies of reports and dashboards.
As the scope of each setting and the following example show, granular security group configurations are available to minimize risk and manage the overall deployment.

Currently, only one tenant setting is available for custom visuals, and this setting (Custom visuals settings) can be enabled or disabled for the entire organization only. For organizations that wish to restrict or prohibit custom visuals for security reasons, this setting can be used to eliminate the ability to add, view, share, or interact with custom visuals. More granular controls for this setting are expected later in 2018, such as the ability to define users or security groups of users who are allowed to use custom visuals.

As an example from the Tenant settings page of the Admin portal, only the users within the BI Admin security group who are not also members of the BI Team security group are allowed to publish apps to the entire organization. A report author who also helps administer the On-premises data gateway via the BI Admin security group would therefore be denied the ability to publish apps to the organization, given their membership in the BI Team security group. Many tenant setting configurations will be simpler than this example, particularly for smaller organizations or at the beginning of Power BI deployments. However, as adoption grows and the team responsible for Power BI changes, it's important that the security groups created to help administer these settings are kept up to date.

Embed Codes

Embed codes are created and stored in the Power BI service when the Publish to web feature is utilized. As described in the Publish to web section of the previous chapter, this feature allows a Power BI report to be embedded in any website or shared via a URL on the public internet. Users with edit rights to the workspace of the published-to-web content are able to manage the embed codes themselves from within the workspace. However, the admin portal provides visibility of and access to embed codes across all workspaces. Via the Actions commands on the far right of the Embed Codes page, a Power BI admin can view the report in a browser (the diagonal arrow) or remove the embed code. The Embed Codes page can be helpful for periodically monitoring the usage of the Publish to web feature, and for scenarios in which data that shouldn't have been included in a publish-to-web report needs to be removed. As noted for the Power BI tenant settings in the previous section, this feature can be enabled or disabled for the entire organization or for specific users within security groups.

Organizational Custom visuals

The Custom Visuals page allows admins to upload and manage custom visuals (.pbiviz files) that have been approved for use within the organization. For example, an organization may have proprietary custom visuals developed internally which it wishes to expose to business users. Alternatively, the organization may wish to define a set of approved custom visuals, such as only the custom visuals that have been certified by Microsoft. As an example, the Chiclet Slicer custom visual can be added as an organizational custom visual from the Organizational visuals page of the Power BI admin portal. The Organizational visuals page provides a link (Add a custom visual) to launch the form and identifies all uploaded visuals, as well as their last update.
Once a visual has been uploaded, it can be deleted but not updated or modified. Therefore, when a new version of an organizational visual becomes available, the new visual can be added to the list of organizational visuals with a descriptive title (for example, Chiclet Slicer v2.0). Deleting an organizational custom visual will cause any reports that use the visual to stop rendering.

Once a custom visual has been uploaded as an organizational custom visual, it is accessible to users in Power BI Desktop: the user can open the MARKETPLACE of custom visuals and select MY ORGANIZATION to go directly to the visuals defined by the organization, rather than searching through the MARKETPLACE. The marketplace of custom visuals can be launched via either the Visualizations pane or the From Marketplace icon on the Home tab of the ribbon.

Organizational custom visuals are not supported for reports or dashboards shared with external users. Additionally, organizational custom visuals used in reports that utilize the publish to web feature will not render outside the Power BI tenant. Moreover, organizational custom visuals are currently a preview feature; therefore, users must enable the My organization custom visuals feature via the Preview features tab of the Options window in Power BI Desktop.

With this, we got you acquainted with the features and processes involved in administering Power BI for an organization, including the configuration of tenant settings in the Power BI admin portal, analyzing the usage of Power BI assets, and monitoring overall user activity via the Office 365 audit logs. If you found this tutorial useful, do check out the book Mastering Microsoft Power BI to develop visually rich, immersive, and interactive Power BI reports and dashboards.

Read more:
- Unlocking the secrets of Microsoft Power BI
- A tale of two tools: Tableau and Power BI
- Building a Microsoft Power BI Data Model
How to migrate Power BI datasets to Microsoft Analysis Services models [Tutorial]

Pravin Dhandre
29 Jun 2018
5 min read
The Azure Analysis Services web designer supports importing a data model contained within a Power BI Desktop file. The imported or migrated model can then take advantage of the resources available to the Azure Analysis Services server and can be accessed from client tools such as Power BI Desktop. Additionally, Azure Analysis Services provides a Visual Studio project file and a Model.bim file for the migrated model that a corporate BI team can use in SSDT for Visual Studio. In this tutorial, you will learn how to migrate your Power BI data to Microsoft Analysis Services for further self-service BI solutions, delivering flexibility to a wide network of stakeholders. This article is an excerpt from a book written by Brett Powell titled Mastering Microsoft Power BI.

The following process migrates the model within a Power BI Desktop file to an Azure Analysis Services server and downloads the Visual Studio project file for the migrated model:

1. Open the Web designer from the Overview page of the Azure Analysis Services resource in the Azure portal.
2. On the Models form, click Add and then provide a name for the new model in the New model form.
3. Select the Power BI Desktop File source icon at the bottom and choose the file on the Import menu.
4. Click Import to begin the migration process.

In this example, a Power BI Desktop file (AdWorks Enterprise.pbix) that contains an import mode model based on two on-premises sources (SQL Server and Excel) is imported via the Azure Analysis Services web designer. Once the import is complete, the Field list from the model is exposed on the right, and the imported model is accessible from client tools like any other Azure Analysis Services model. For example, refreshing the Azure AS server in SQL Server Management Studio exposes the new database (AdWorks Enterprise). Likewise, the Azure Analysis Services database connection in Power BI Desktop (Get Data | Azure) can be used to connect to the migrated model. Just like the SQL Server Analysis Services database connection (Get Data | Database), the only required field is the name of the server, which is provided in the Azure portal.

From the Overview page of the Azure Analysis Services resource, select the Open in Visual Studio project option from the context menu on the far right. Save the zip file provided by Azure Analysis Services to a secure local network location, then extract the files from the zip file to expose the Analysis Services project and .bim file.

In Visual Studio, open a project/solution (File | Open | Project/Solution) and navigate to the downloaded project file (.smproj). Select the project file and click Open. Double-click the Model.bim file in the Solution Explorer window to expose the metadata of the migrated model. All of the objects of the data model built into the Power BI Desktop file, including data sources, queries, and measures, are accessible in SSDT just like in standard Analysis Services projects. The Diagram view in SQL Server Data Tools exposes the two on-premises sources of the imported PBIX file via the Tabular Model Explorer window. By default, the deployment server of the Analysis Services project in SSDT is set to the Azure Analysis Services server.
As an alternative to a new solution with a single project, an existing solution containing an existing Analysis Services project can be opened and the new project from the migration added to that solution. This can be accomplished by right-clicking the existing solution's name in the Solution Explorer window and selecting Existing project from the Add menu (Add | Existing project). This approach allows the corporate BI developer to view and compare both models and optionally implement incremental changes, such as new columns or measures that were exclusive to the Power BI Desktop file. A solution in Visual Studio can therefore include both the migrated model (via the project file) and an existing Analysis Services model (AdWorks Import, in this example).

The ability to quickly migrate Power BI datasets to Analysis Services models complements the flexibility and scale of Power BI Premium capacity, allowing organizations to manage and deploy Power BI on their own terms. By now, you have successfully migrated your Power BI datasets to Analysis Services and can enjoy the complete flexibility of making further edits to your model to mine better insights from it. If you found this tutorial useful, do check out the book Mastering Microsoft Power BI and start producing insightful and beautiful reports from hundreds of data sources and scale across the enterprise.

Read more:
- How to use M functions within Microsoft Power BI for querying data
- Building a Microsoft Power BI Data Model
- How to build a live interactive visual dashboard in Power BI with Azure Stream
Indexing, Replicating, and Sharding in MongoDB [Tutorial]

Amey Varangaonkar
29 Jun 2018
11 min read
MongoDB is an open source, document-oriented, cross-platform database, primarily written in C++. It is the leading NoSQL database, ranking fifth overall just behind PostgreSQL. It provides high performance, high availability, and easy scalability, and it uses JSON-like documents with flexible schemas. MongoDB, developed by MongoDB Inc., is free to use; it is published under a combination of the GNU Affero General Public License and the Apache License. In this article, we look at the indexing, replication, and sharding features offered by MongoDB. The following excerpt is taken from the book 'Seven NoSQL Databases in a Week' written by Aaron Ploetz et al.

Introduction to MongoDB indexing

Indexes allow the efficient execution of MongoDB queries. If we don't have indexes, MongoDB has to scan all the documents in a collection to select those that match the query criteria. If proper indexing is used, MongoDB can limit the scanning of documents and select documents efficiently. Indexes are a special data structure that store some field values of documents in an easy-to-traverse way. Indexes store the values of specific fields or sets of fields, ordered by the values of those fields. The ordering of field values allows us to apply effective traversal algorithms, such as binary search, and also supports range-based operations effectively. In addition, MongoDB can return sorted results easily. Indexes in MongoDB work like indexes in other database systems: MongoDB defines indexes at the collection level and supports indexes on fields and sub-fields of documents.

The default _id index

MongoDB creates the default _id index when creating a document. The _id index prevents users from inserting two documents with the same _id value. You cannot drop the index on the _id field. The following syntax is used to create an index in MongoDB:

>db.collection.createIndex(<key and index type specification>, <options>);

The preceding method creates an index only if an index with the same specification does not already exist. MongoDB indexes use the B-tree data structure. The following are the different types of indexes:

Single field: In addition to the _id field index, MongoDB allows the creation of an index on any single field, in ascending or descending order. For a single field index, the order of the index does not matter, as MongoDB can traverse the index in either direction. For example, to create an index on the firstName field of the user_profiles collection:

>db.user_profiles.createIndex({firstName: 1});

The query gives an acknowledgment after creating the index. This creates an ascending index on the firstName field; to create a descending index, we provide -1 instead of 1.

Compound index: MongoDB also supports user-defined indexes on multiple fields. The order of the fields defined while creating the index has a significant effect. For example, a compound index defined as {firstName: 1, age: -1} will sort data by firstName first and then, within each firstName, by age.

Multikey index: MongoDB uses multikey indexes to index the content of arrays. If you index a field that contains array values, MongoDB creates an index entry for each element of the array. These indexes allow queries to select documents by matching an element or set of elements of the array. MongoDB automatically decides whether to create a multikey index or not.

Text indexes: MongoDB provides text indexes that support the searching of string content in a MongoDB collection.
To create text indexes, we also use the db.collection.createIndex() method, but we pass the string literal "text" as the index type:

>db.collection.createIndex({<field>: "text"});

You can also create a text index on multiple fields, for example:

>db.collection.createIndex({<field1>: "text", <field2>: "text"});

Once the index is created, we get an acknowledgment. Compound indexes can be used with text indexes to define an ascending or descending order of the index.

Hashed index: To support hash-based sharding, MongoDB supports hashed indexes. In this approach, the index stores the hash value, and query and select operations check the hashed index. Hashed indexes can support only equality-based operations; they are limited in their performance of range-based operations.

Indexes have the following properties:

- Unique indexes: Indexes should maintain uniqueness. This makes MongoDB reject duplicate values for the indexed field.
- Partial indexes: Partial indexes apply the index only to the documents of a collection that match a specified condition. By applying an index to a subset of the documents in the collection, partial indexes have lower storage requirements as well as a reduced performance cost.
- Sparse index: In a sparse index, MongoDB includes only those documents in which the indexed field is present; other documents are discarded. We can combine a unique index with a sparse index to reject documents that have duplicate values while ignoring documents that lack the indexed key.
- TTL index: TTL indexes are a special type of index where MongoDB will automatically remove a document from the collection after a certain amount of time. Such indexes are ideal for removing machine-generated data, logs, and session information that we only need for a finite duration. The following TTL index will automatically delete data from the log collection after 3,000 seconds:

>db.log.createIndex({<dateField>: 1}, {expireAfterSeconds: 3000});

Once the index is created, we get an acknowledgment message.

The limitations of indexes:

- A single collection can have up to 64 indexes only.
- The qualified index name is <database-name>.<collection-name>.$<index-name> and cannot have more than 128 characters. By default, the index name is a combination of the index type and field name. You can specify an index name while using the createIndex() method to ensure that the fully qualified name does not exceed the limit.
- There can be no more than 31 fields in a compound index.
- A query cannot use both text and geospatial indexes. You cannot combine the $text operator, which requires a text index, with query operators that require other special indexes. For example, you cannot combine the $text operator with the $near operator.
- Fields with 2dsphere indexes can only hold geometry data. 2dsphere indexes are provided specifically for geometric data operations. For example, to perform operations on coordinates, we have to provide the data as points on a planar coordinate system, [x, y]. For non-geometric data, the query operation will fail.

The limitations on data:

- The maximum number of documents in a capped collection must be less than 2^32. We should define this with the max parameter while creating the collection. If you do not specify it, the capped collection can hold any number of documents, which will slow down queries.
- The MMAPv1 storage engine allows 16,000 data files per database, which means it provides a maximum database size of 32 TB. We can set the storage.mmapv1.smallFiles parameter to reduce the size of the database to 8 TB.
- Replica sets can have up to 50 members.
- Shard keys cannot exceed 512 bytes.
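The same index types can be created from application code through a driver rather than the mongo shell. The sketch below is an illustrative example using the PyMongo driver; the connection string, database, collection, and field names (user_profiles, firstName, age, bio, createdAt) are assumptions for demonstration, not values prescribed by the book.

```python
# Illustrative PyMongo sketch; collection and field names are placeholders.
from pymongo import MongoClient, ASCENDING, DESCENDING, TEXT

client = MongoClient("mongodb://localhost:27017")  # assumed local server
coll = client.testdb.user_profiles

# Single-field ascending index (equivalent to {firstName: 1}).
coll.create_index([("firstName", ASCENDING)])

# Compound index: sort by firstName, then by age descending.
coll.create_index([("firstName", ASCENDING), ("age", DESCENDING)])

# Text index on an assumed 'bio' field, then a $text search that uses it.
coll.create_index([("bio", TEXT)])
print(coll.count_documents({"$text": {"$search": "mongodb"}}))

# TTL index: documents expire 3000 seconds after the 'createdAt' timestamp.
client.testdb.log.create_index("createdAt", expireAfterSeconds=3000)
```

create_index is idempotent, so re-running the script against an existing deployment simply leaves matching indexes in place.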
Replication in MongoDB

A replica set is a group of MongoDB instances that store the same set of data. Replicas are used in production to ensure high availability of data.

Redundancy and data availability: because of replication, we have redundant data across the MongoDB instances. We use replication to provide high availability of data to the application: if one instance of MongoDB is unavailable, we can serve data from another instance. Replication also increases the read capacity of applications, as read operations can be sent to different servers to retrieve data faster. By maintaining data on different servers, we can increase the locality of data and improve its availability for distributed applications. We can use the replica copies for backup, reporting, as well as disaster recovery.

Working with replica sets

A replica set is a group of MongoDB instances that share the same dataset. A replica set can have one arbiter node and multiple data-bearing nodes. Among the data-bearing nodes, one node is the primary node while the others are secondary nodes. All write operations happen at the primary node. Once a write occurs at the primary node, the data is replicated across the secondary nodes internally to make copies of the data available to all nodes and to avoid data inconsistency. If the primary node becomes unavailable, the secondary nodes use an election algorithm to select one of themselves as the new primary node.

A special node, called an arbiter node, can be added to the replica set. The arbiter node does not store any data; it is used to maintain a quorum in the replica set by responding to the heartbeat and election requests sent by the secondary nodes. As an arbiter does not store data, it is a cost-effective resource in the election process: if the votes in an election are even, the arbiter adds a vote to choose the primary node. The arbiter node always remains an arbiter; it will not change its behavior, unlike primary and secondary nodes, where the primary can step down and work as a secondary and a secondary can be elected to act as the primary. Secondary nodes apply the operations from the primary node asynchronously.

Automatic failover in replication

The primary node communicates with the other members every 10 seconds. If it fails to communicate with the others within 10 seconds, the other eligible secondary nodes hold an election to choose one of them to act as primary. The first secondary node that holds the election and receives the majority of votes is elected as the primary node. If there is an arbiter node, its vote is taken into consideration while choosing the primary node.

Read operations

By default, read operations happen at the primary node only, but we can direct read operations to secondary nodes as well. A read from a secondary node does not affect the data at the primary node, but reading from secondary nodes can return inconsistent (stale) data.
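From an application, connecting to a replica set and opting in to secondary reads is a driver-level setting. The sketch below is an illustrative PyMongo example; the hostnames, the replica set name rs0, and the database/collection names are assumptions for demonstration.

```python
# Illustrative PyMongo sketch; hostnames and the replica set name are placeholders.
from pymongo import MongoClient, ReadPreference

# Connect to a three-member replica set; the driver discovers the primary automatically.
client = MongoClient(
    "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0"
)

# Writes always go to the primary.
client.testdb.events.insert_one({"type": "login"})

# Route reads to a secondary when one is available, accepting possibly stale data.
events_secondary = client.testdb.get_collection(
    "events", read_preference=ReadPreference.SECONDARY_PREFERRED
)
print(events_secondary.count_documents({}))
```

SECONDARY_PREFERRED is a common compromise: reads are offloaded from the primary when a secondary is healthy, but still succeed against the primary if no secondary is reachable.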
Sharding in MongoDB

Sharding is a methodology for distributing data across multiple machines. It is used for deployments with large datasets and high-throughput operations. A single server cannot handle a very large dataset, as it requires more storage, and bulk query operations can consume most of the CPU cycles, which slows down processing. For such scenarios, we need more powerful systems. One approach is to add more capacity to a single server, such as more memory or more processing units; this is called vertical scaling. Another approach is to divide the large dataset across multiple systems and have the application query data from multiple servers; this approach is called horizontal scaling. MongoDB handles horizontal scaling through sharding.

Sharded clusters

MongoDB's sharding consists of the following components:

- Shard: Each shard stores a subset of the sharded data. Each shard can also be deployed as a replica set.
- Mongos: Mongos provides an interface between the client application and the sharded cluster, routing queries.
- Config servers: The config servers store the metadata and configuration settings for the cluster.

MongoDB data is sharded at the collection level and distributed across the sharded cluster.

Shard keys: To distribute the documents in a collection, MongoDB partitions the collection using the shard key. MongoDB splits the data into chunks, and these chunks are distributed across the shards in the sharded cluster.

Advantages of sharding

Here are some of the advantages of sharding:

- When we use sharding, the load of read/write operations gets distributed across the sharded cluster.
- As sharding distributes data across the cluster, we can increase storage capacity by adding shards horizontally.
- MongoDB allows read/write operations to continue even if one of the shards is unavailable. In a production environment, shards should be deployed with replication to maintain high availability and add fault tolerance to the system.

Indexing, sharding, and replication are three of the most important tasks to perform on any database, as they ensure optimal querying and database performance. In this article, we saw how MongoDB facilitates these tasks and makes them as easy as possible for administrators. If you found the excerpt useful, make sure you check out the book Seven NoSQL Databases in a Week to learn more about database administration in MongoDB, as well as the other popularly used NoSQL databases such as Redis, HBase, Neo4j, and more.

Read more:
- Top 5 programming languages for crunching Big Data effectively
- Top 5 NoSQL Databases
- Is Apache Spark today's Hadoop?
Build an Actuator app for controlling Illumination with Raspberry Pi 3

Gebin George
28 Jun 2018
10 min read
In this article, we will look at how to build an actuator application for controlling illumination. This article is an excerpt from the book Mastering Internet of Things, written by Peter Waher. The book will help you design and implement IoT solutions with single-board computers.

Preparing our project

Let's create a new Universal Windows Platform application project. This time, we'll call it Actuator. We can again use the Raspberry Pi 3, even though we will only use the relay in this project. To make the persistence of application states even easier, we'll also include the latest version of the NuGet package Waher.Runtime.Settings in the project. It uses the underlying object database defined by Waher.Persistence to persist application settings.

Defining control parameters

Actuators come in all sorts, types, and sizes, from the very complex to the very simple. While it would be possible to create a proprietary format that configures the actuator in a bulk operation, such a method is doomed to fail if you aim for any kind of interoperable communication. Since the internet is based on interoperability as a core principle, we should consider this from the start, during the design phase. Interoperability means devices can learn to operate together, even if they are from different manufacturers. To achieve this, devices must be able to describe what they can do, in a way that each participant understands. To be able to do this, we need a way to break down (divide and conquer) a complex actuator into parts that are easily described and understood. One way is to see an actuator as a collection of control parameters. Each control parameter is a named parameter with a simple and recognizable data type (in the same way, we can see a sensor as a collection of sensor data fields). For our example, we will only need one control parameter: a Boolean control parameter controlling the state of our relay. We'll just call it Output, for simplicity.

Understanding relays

Relays, simply put, are electric switches that we can control using a small output signal. They're perfect for letting small controllers, like the Raspberry Pi, switch other circuits with higher voltages on and off. The simplest example is to use a relay to switch a lamp on and off. We can't light the lamp using the voltage available to us on the Raspberry Pi, but we can use a relay as a switch to control the lamp.

The principal part of a normal relay is a coil. When electricity runs through it, it magnetizes an iron core, which in turn moves a lever from the Normally Closed (NC) connector to the Normally Open (NO) connector. When the electricity is cut, a spring returns the lever from the NO connector to the NC connector. This movement of the lever from one connector to the other causes a characteristic clicking sound, which tells you that the relay works. The lever in turn is connected to the Common Ground (COM) connector. We control the flow of current through the coil (L1) using our output SIGNAL (D1 in our case). Internally, in the relay, a resistor (R1) is placed before the base pin of the transistor (T1) to adapt the signal voltage to an appropriate level. When we connect or cut the current through the coil, it will induce a reverse current, which may be harmful to the transistor when the current is being cut.
For that reason, a fly-back diode (D1) is added, allowing excess current to be fed back and avoiding harm to the transistor.

Connecting our lamp

Now that we know how a relay works, it's relatively easy to connect our lamp to it. Since we want the lamp to be illuminated when we turn the relay on (set D1 to HIGH), we will use the NO and COM connectors and leave the NC connector alone. If the lamp has a normal two-wire AC cable, we can insert the relay into the AC circuit by simply cutting one of the wires, inserting one end into the NO connector and the other into the COM connector. Be sure to follow appropriate safety regulations when working with electricity.

Connecting an LED

An alternative to working with alternating current (AC) is to use a low-power direct current (DC) source and an LED to simulate a lamp. You can connect the COM connector to a resistor and an LED, and then to ground (GND) on one end, and the NO connector directly to the 5V or 3.3V source on the Raspberry Pi on the other end. The size of the resistor is determined by how much current the LED needs to light up and the voltage source you choose. If the LED needs 20 mA and you connect it to a 5V source, Ohm's law tells us we need a resistor of R = U/I = 5V/0.02A = 250 Ω.

Controlling the output

The relay is connected to digital output pin 9 on the Arduino board. As such, controlling it is a simple call to the digitalWrite() method on our arduino object. Since we will need to perform this control action from various locations in code in later chapters, we'll create a method for it:

internal async Task SetOutput(bool On, string Actor)
{
    if (this.arduino != null)
    {
        this.arduino.digitalWrite(9, On ? PinState.HIGH : PinState.LOW);

The first parameter simply states the new value of the control parameter. We add a second parameter that describes who is making the requested change; this will come in handy later, when we allow online users to change control parameters.
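For readers prototyping on Raspbian (or another Linux distribution) instead of Windows 10 IoT Core with an Arduino board, the same relay can be driven straight from a Raspberry Pi GPIO pin. This is not part of the book's C# project: the sketch below is an illustrative alternative using the RPi.GPIO Python library, and the BCM pin number is an assumption you would adjust to your wiring.

```python
# Illustrative alternative for Raspbian; the pin number is a placeholder.
import time
import RPi.GPIO as GPIO

RELAY_PIN = 18  # assumed BCM pin wired to the relay's signal input

GPIO.setmode(GPIO.BCM)
GPIO.setup(RELAY_PIN, GPIO.OUT, initial=GPIO.LOW)

def set_output(on: bool, actor: str) -> None:
    """Switch the relay and note who requested the change."""
    GPIO.output(RELAY_PIN, GPIO.HIGH if on else GPIO.LOW)
    print(f"Output set to {on} by {actor}")

try:
    set_output(True, "local test")   # lamp on
    time.sleep(2)
    set_output(False, "local test")  # lamp off
finally:
    GPIO.cleanup()
```

The structure mirrors the C# SetOutput() method above: one function owns the pin write and records who asked for the change.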
Saving the current control state is then simply a matter of calling the Set() or SetAsync() methods on the static RuntimeSettings class, defined in the Waher.Runtime.Settings namespace: await RuntimeSettings.SetAsync("Actuator.Output", On); During the initialization of the device, we then call the Get() or GetAsync() methods to get the last value, if it exists. If it does not exist, a default value we define is returned: bool LastOn = await RuntimeSettings.GetAsync("Actuator.Output", false); this.arduino.digitalWrite(1, LastOn ? PinState.HIGH : PinState.LOW); Logging important control events In distributed IoT control applications, it's vitally important to make sure unauthorized access to the system is avoided. While we will dive deeper into this subject in later chapters, one important tool we can start using it to log everything of a security interest in the event log. We can decide what to do with the event log later, whether we want to analyze or store it locally or distribute it in the network for analysis somewhere else. But unless we start logging events of security interest directly when we develop, we risk forgetting logging certain events later. So, let's log an event every time the output is set: Log.Informational("Setting Control Parameter.", string.Empty, Actor ?? "Windows user", new KeyValuePair<string, object>("Output", On)); If the Actor parameter is null, we assume the control parameter has been set from the Windows GUI. We use this fact, to update the window if the change has been requested from somewhere else: if (Actor != null) await MainPage.Instance.OutputSet(On); Using Raspberry Pi GPIO pins directly The Raspberry Pi can also perform input and output without an Arduino board. But the General-Purpose Input/Output (GPIO) pins available only supports digital input and output. Since the relay module is controlled through a digital output, we can connect it directly to the Raspberry Pi, if we want. That way, we don't need the Arduino board. (We wouldn't be able to test-run the application on the local machine either, though.) Checking whether GPIO is available GPIO pins are accessed through the GpioController class defined in the Windows.Devices.Gpio namespace. First, we must check that GPIO is available on the machine. We do this by getting the default controller, and checking whether it's available: gpio = GpioController.GetDefault(); if (gpio != null) { ... } else Log.Error("Unable to get access to GPIO pin " + gpioOutputPin.ToString()); Initializing the GPIO output pin Once we have access to the controller, we can try to open exclusive access to the GPIO pin we've connected the relay to: if (gpio.TryOpenPin(gpioOutputPin, GpioSharingMode.Exclusive, out this.gpioPin, out GpioOpenStatus Status) && Status == GpioOpenStatus.PinOpened) { ... } else Log.Error("Unable to get access to GPIO pin " + gpioOutputPin.ToString()); Through the GpioPin object gpioPin, we can now control the pin. The first step is to set the operating mode for the pin. This is done by calling the SetDriveMode() method. There are many different modes a pin can be set to, not all necessarily supported by the underlying firmware and hardware. To check that a mode is supported, call the IsDriveModeSupported() method first: if (this.gpioPin.IsDriveModeSupported(GpioPinDriveMode.Output)) { This.gpioPin.SetDriveMode(GpioPinDriveMode.Output); ... 
} else Log.Error("Output mode not supported for GPIO pin " + gpioOutputPin.ToString()); There are various output modes available: Output, OutputOpenDrain, OutputOpenDrainPullUp, OutputOpenSource, and OutputOpenSourcePullDown. The code documentation for each flag describes the particulars of each option. Setting the GPIO pin output To set the actual output value, we call the Write() method on the pin object: bool LastOn = await RuntimeSettings.GetAsync("Actuator.Output", false); this.gpioPin.Write(LastOn ? GpioPinValue.High : GpioPinValue.Low); We need to make a similar change in the SetOutput() method. The Actuator project in the MIOT repository uses the Arduino use case by default. The GPIO code is also available through conditional compiling. It is activated by uncommenting the GPIO switch definition on the first row of the App.xaml.cs file. You can also perform Digital Input using principles similar to the preceding ones, with some differences. First, you select an input drive mode: Input, InputPullUp or InputPullDown. You then use the Read() method to read the current state of the pin. You can also use the ValueChanged event to get a notification whenever the input pin changes value. We saw how to create a simple actuator app for the Raspberry Pi using C#. If you found our post useful, do check out this title Mastering Internet of Things, to build complex projects using motions detectors, controllers, sensors, and Raspberry Pi 3. Should you go with Arduino Uno or Raspberry Pi 3 for your next IoT project? Build your first Raspberry Pi project Meet the Coolest Raspberry Pi Family Member: Raspberry Pi Zero W Wireless

Build and train an RNN chatbot using TensorFlow [Tutorial]

Sunith Shetty
28 Jun 2018
21 min read
Chatbots are increasingly used as a way to provide assistance to users. Many companies, including banks, mobile/landline operators, and large e-sellers, now use chatbots for customer assistance and for helping users with pre- and post-sales queries. They are a great tool for companies, which then don't need to provide additional customer service capacity for trivial questions: it really looks like a win-win situation! In today's tutorial, we will understand how to train an automatic chatbot that will be able to answer simple and generic questions, and how to create an endpoint over HTTP for providing the answers via an API. This article is an excerpt from a book written by Luca Massaron, Alberto Boschetti, Alexey Grigorev, Abhishek Thakur, and Rajalingappaa Shanmugamani titled TensorFlow Deep Learning Projects.

There are mainly two types of chatbot. The first is a simple one, which tries to understand the topic, always providing the same answer for all questions about the same topic. For example, on a train website, the questions Where can I find the timetable of the City_A to City_B service? and What's the next train departing from City_A? will likely get the same answer, which could read Hi! The timetable on our network is available on this page: <link>. These chatbots use classification algorithms to understand the topic (in the example, both questions are about the timetable topic). Given the topic, they always provide the same answer. Usually, they have a list of N topics and N answers; also, if the probability of the classified topic is low (the question is too vague, or it's on a topic not included in the list), they usually ask the user to be more specific and to repeat the question, eventually pointing out other ways to ask the question (sending an email or calling the customer service number, for example).

The second type of chatbot is more advanced, smarter, but also more complex. For these, the answers are built using an RNN, in the same way that machine translation is performed. These chatbots are able to provide more personalized answers, and they may give a more specific reply. In fact, they don't just guess the topic; with an RNN engine, they're able to understand more about the user's question and provide the best possible answer. It's very unlikely you'll get the same answer for two different questions with these chatbots.

The input corpus

Unfortunately, we haven't found any consumer-oriented dataset that is open source and freely available on the Internet. Therefore, we will train the chatbot with a more generic dataset, not really focused on customer service. Specifically, we will use the Cornell Movie Dialogs Corpus, from Cornell University. The corpus contains the collection of conversations extracted from raw movie scripts, so the chatbot will be better at answering fictional questions than real-world ones. The Cornell corpus contains more than 200,000 conversational exchanges between over 10,000 pairs of movie characters, extracted from 617 movies. The dataset is available here: https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html. We would like to thank the authors for having released the corpus: that makes experimentation, reproducibility, and knowledge sharing easier.

The dataset comes as a .zip archive file. After decompressing it, you'll find several files in it:

README.txt contains the description of the dataset, the format of the corpora files, the details of the collection procedure, and the author's contact information.
Chameleons.pdf is the original paper for which the corpus was released. Although the paper is not strictly about chatbots, it studies the language used in dialogues, and it's a good source of information for understanding more.

movie_conversations.txt contains the structure of all the dialogues. For each conversation, it includes the IDs of the two characters involved in the discussion, the ID of the movie, and the list of sentence IDs (or utterances, to be more precise) in chronological order. For example, the first line of the file is:

u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197']

That means that character u0 had a conversation with character u2 in the movie m0, and the conversation had 4 utterances: 'L194', 'L195', 'L196', and 'L197'.

movie_lines.txt contains the actual text of each utterance ID and the person who produced it. For example, the utterance L195 is listed here as:

L195 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Well, I thought we'd start with pronunciation, if that's okay with you.

So, the text of the utterance L195 is Well, I thought we'd start with pronunciation, if that's okay with you. And it was pronounced by the character u2, whose name is CAMERON, in the movie m0.

movie_titles_metadata.txt contains information about the movies, including the title, year, IMDB rating, the number of votes on IMDB, and the genres. For example, the movie m0 is described here as:

m0 +++$+++ 10 things i hate about you +++$+++ 1999 +++$+++ 6.90 +++$+++ 62847 +++$+++ ['comedy', 'romance']

So, the title of the movie whose ID is m0 is 10 things i hate about you, it's from 1999, it's a comedy with romance, and it received almost 63 thousand votes on IMDB with an average score of 6.9 (out of 10.0).

movie_characters_metadata.txt contains information about the movie characters, including the name, the title of the movie where he/she appears, the gender (if known), and the position in the credits (if known). For example, the character u2 appears in this file with this description:

u2 +++$+++ CAMERON +++$+++ m0 +++$+++ 10 things i hate about you +++$+++ m +++$+++ 3

The character u2 is named CAMERON, it appears in the movie m0 whose title is 10 things i hate about you, his gender is male, and he's the third person appearing in the credits.

raw_script_urls.txt contains the source URL where the dialogues of each movie can be retrieved. For example, for the movie m0, that's:

m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html

As you will have noticed, most files use the token +++$+++ to separate the fields. Beyond that, the format looks pretty straightforward to parse. Please take particular care while parsing the files: their encoding is not UTF-8 but ISO-8859-1.

Creating the training dataset

Let's now create the training set for the chatbot. We need all the conversations between the characters in the correct order; fortunately, the corpus contains more than what we actually need. For creating the dataset, we will start by downloading the zip archive, if it's not already on disk. We'll then decompress the archive in a temporary folder (if you're using Windows, that should be C:\Temp), and we will read just the movie_lines.txt and movie_conversations.txt files, the ones we really need to create a dataset of consecutive utterances. Let's now go step by step, creating multiple functions, one for each step, in the file corpora_downloader.py. The first function we need retrieves the file from the Internet, if it is not available on disk.
def download_and_decompress(url, storage_path, storage_dir): import os.path directory = storage_path + "/" + storage_dir zip_file = directory + ".zip" a_file = directory + "/cornell movie-dialogs corpus/README.txt" if not os.path.isfile(a_file): import urllib.request import zipfile urllib.request.urlretrieve(url, zip_file) with zipfile.ZipFile(zip_file, "r") as zfh: zfh.extractall(directory) return This function does exactly that: it checks whether the “README.txt” file is available locally; if not, it downloads the file (thanks for the urlretrieve function in the urllib.request module) and it decompresses the zip (using the zipfile module). The next step is to read the conversation file and extract the list of utterance IDS. As a reminder, its format is: u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197'], therefore what we're looking for is the fourth element of the list after we split it on the token  +++$+++ . Also, we'd need to clean up the square brackets and the apostrophes to have a clean list of IDs. For doing that, we shall import the re module, and the function will look like this. import re def read_conversations(storage_path, storage_dir): filename = storage_path + "/" + storage_dir + "/cornell movie-dialogs corpus/movie_conversations.txt" with open(filename, "r", encoding="ISO-8859-1") as fh: conversations_chunks = [line.split(" +++$+++ ") for line in fh] return [re.sub('[[]']', '', el[3].strip()).split(", ") for el in conversations_chunks] As previously said, remember to read the file with the right encoding, otherwise, you'll get an error. The output of this function is a list of lists, each of them containing the sequence of utterance IDS in a conversation between characters. Next step is to read and parse the movie_lines.txt file, to extract the actual utterances texts. As a reminder, the file looks like this line: L195 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Well, I thought we'd start with pronunciation, if that's okay with you. Here, what we're looking for are the first and the last chunks. def read_lines(storage_path, storage_dir): filename = storage_path + "/" + storage_dir + "/cornell movie-dialogs corpus/movie_lines.txt" with open(filename, "r", encoding="ISO-8859-1") as fh: lines_chunks = [line.split(" +++$+++ ") for line in fh] return {line[0]: line[-1].strip() for line in lines_chunks} The very last bit is about tokenization and alignment. We'd like to have a set whose observations have two sequential utterances. In this way, we will train the chatbot, given the first utterance, to provide the next one. Hopefully, this will lead to a smart chatbot, able to reply to multiple questions. Here's the function: def get_tokenized_sequencial_sentences(list_of_lines, line_text): for line in list_of_lines: for i in range(len(line) - 1): yield (line_text[line[i]].split(" "), line_text[line[i+1]].split(" ")) Its output is a generator containing a tuple of the two utterances (the one on the right follows temporally the one on the left). Also, utterances are tokenized on the space character. Finally, we can wrap up everything into a function, which downloads the file and unzip it (if not cached), parse the conversations and the lines, and format the dataset as a generator. 
As a default, we will store the files in the /tmp directory: def retrieve_cornell_corpora(storage_path="/tmp", storage_dir="cornell_movie_dialogs_corpus"): download_and_decompress("http://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip", storage_path, storage_dir) conversations = read_conversations(storage_path, storage_dir) lines = read_lines(storage_path, storage_dir) return tuple(zip(*list(get_tokenized_sequencial_sentences(conversations, lines)))) At this point, our training set looks very similar to the training set used in the translation project. We can, therefore, use some pieces of code we've developed in the machine learning translation article. For example, the corpora_tools.py file can be used here without any change (also, it requires the data_utils.py). Given that file, we can dig more into the corpora, with a script to check the chatbot input. To inspect the corpora, we can use the corpora_tools.py, and the file we've previously created. Let's retrieve the Cornell Movie Dialog Corpus, format the corpora and print an example and its length: from corpora_tools import * from corpora_downloader import retrieve_cornell_corpora sen_l1, sen_l2 = retrieve_cornell_corpora() print("# Two consecutive sentences in a conversation") print("Q:", sen_l1[0]) print("A:", sen_l2[0]) print("# Corpora length (i.e. number of sentences)") print(len(sen_l1)) assert len(sen_l1) == len(sen_l2) This code prints an example of two tokenized consecutive utterances, and the number of examples in the dataset, that is more than 220,000: # Two consecutive sentences in a conversation Q: ['Can', 'we', 'make', 'this', 'quick?', '', 'Roxanne', 'Korrine', 'and', 'Andrew', 'Barrett', 'are', 'having', 'an', 'incredibly', 'horrendous', 'public', 'break-', 'up', 'on', 'the', 'quad.', '', 'Again.'] A: ['Well,', 'I', 'thought', "we'd", 'start', 'with', 'pronunciation,', 'if', "that's", 'okay', 'with', 'you.'] # Corpora length (i.e. number of sentences) 221616 Let's now clean the punctuation in the sentences, lowercase them and limits their size to 20 words maximum (that is examples where at least one of the sentences is longer than 20 words are discarded). This is needed to standardize the tokens: clean_sen_l1 = [clean_sentence(s) for s in sen_l1] clean_sen_l2 = [clean_sentence(s) for s in sen_l2] filt_clean_sen_l1, filt_clean_sen_l2 = filter_sentence_length(clean_sen_l1, clean_sen_l2) print("# Filtered Corpora length (i.e. number of sentences)") print(len(filt_clean_sen_l1)) assert len(filt_clean_sen_l1) == len(filt_clean_sen_l2) This leads us to almost 140,000 examples: # Filtered Corpora length (i.e. number of sentences) 140261 Then, let's create the dictionaries for the two sets of sentences. Practically, they should look the same (since the same sentence appears once on the left side, and once in the right side) except there might be some changes introduced by the first and last sentences of a conversation (they appear only once). 
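The helper functions imported from corpora_tools.py (clean_sentence, filter_sentence_length, create_indexed_dictionary, sentences_to_indexes, and so on) come from the machine translation project and are not reproduced in this excerpt. To give an idea of what two of them do, here is a minimal sketch; the implementations in the book's repository may differ in details such as the exact punctuation handling and default limits:

import re

def clean_sentence(sentence):
    """Lowercase the tokens of an already-tokenized sentence and
    split punctuation into separate tokens."""
    regex_split = re.compile("([.,!?:;'\"()])")
    cleaned = []
    for token in sentence:
        token = token.strip().lower()
        if not token:
            continue
        # Surround punctuation with spaces, then re-split on whitespace
        cleaned.extend(t for t in regex_split.sub(r" \1 ", token).split() if t)
    return cleaned

def filter_sentence_length(sentences_l1, sentences_l2, min_len=0, max_len=20):
    """Keep only the sentence pairs where both sides fall within the length bounds."""
    filtered_l1, filtered_l2 = [], []
    for s1, s2 in zip(sentences_l1, sentences_l2):
        if min_len <= len(s1) <= max_len and min_len <= len(s2) <= max_len:
            filtered_l1.append(s1)
            filtered_l2.append(s2)
    return filtered_l1, filtered_l2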
To make the best out of our corpora, let's build two dictionaries of words and then encode all the words in the corpora with their dictionary indexes: dict_l1 = create_indexed_dictionary(filt_clean_sen_l1, dict_size=15000, storage_path="/tmp/l1_dict.p") dict_l2 = create_indexed_dictionary(filt_clean_sen_l2, dict_size=15000, storage_path="/tmp/l2_dict.p") idx_sentences_l1 = sentences_to_indexes(filt_clean_sen_l1, dict_l1) idx_sentences_l2 = sentences_to_indexes(filt_clean_sen_l2, dict_l2) print("# Same sentences as before, with their dictionary ID") print("Q:", list(zip(filt_clean_sen_l1[0], idx_sentences_l1[0]))) print("A:", list(zip(filt_clean_sen_l2[0], idx_sentences_l2[0]))) That prints the following output. We also notice that a dictionary of 15 thousand entries doesn't contain all the words and more than 16 thousand (less popular) of them don't fit into it: [sentences_to_indexes] Did not find 16823 words [sentences_to_indexes] Did not find 16649 words # Same sentences as before, with their dictionary ID Q: [('well', 68), (',', 8), ('i', 9), ('thought', 141), ('we', 23), ("'", 5), ('d', 83), ('start', 370), ('with', 46), ('pronunciation', 3), (',', 8), ('if', 78), ('that', 18), ("'", 5), ('s', 12), ('okay', 92), ('with', 46), ('you', 7), ('.', 4)] A: [('not', 31), ('the', 10), ('hacking', 7309), ('and', 23), ('gagging', 8761), ('and', 23), ('spitting', 6354), ('part', 437), ('.', 4), ('please', 145), ('.', 4)] As the final step, let's add paddings and markings to the sentences: data_set = prepare_sentences(idx_sentences_l1, idx_sentences_l2, max_length_l1, max_length_l2) print("# Prepared minibatch with paddings and extra stuff") print("Q:", data_set[0][0]) print("A:", data_set[0][1]) print("# The sentence pass from X to Y tokens") print("Q:", len(idx_sentences_l1[0]), "->", len(data_set[0][0])) print("A:", len(idx_sentences_l2[0]), "->", len(data_set[0][1])) And that, as expected, prints: # Prepared minibatch with paddings and extra stuff Q: [0, 68, 8, 9, 141, 23, 5, 83, 370, 46, 3, 8, 78, 18, 5, 12, 92, 46, 7, 4] A: [1, 31, 10, 7309, 23, 8761, 23, 6354, 437, 4, 145, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0] # The sentence pass from X to Y tokens Q: 19 -> 20 A: 11 -> 22 Training the chatbot After we're done with the corpora, it's now time to work on the model. This project requires again a sequence to sequence model, therefore we can use an RNN. Even more, we can reuse part of the code from the previous project: we'd just need to change how the dataset is built, and the parameters of the model. We can then copy the training script, and modify the build_dataset function, to use the Cornell dataset. Mind that the dataset used in this article is bigger than the one used in the machine learning translation article, therefore you may need to limit the corpora to a few dozen thousand lines. On a 4 years old laptop with 8GB RAM, we had to select only the first 30 thousand lines, otherwise, the program ran out of memory and kept swapping. As a side effect of having fewer examples, even the dictionaries are smaller, resulting in less than 10 thousands words each. 
def build_dataset(use_stored_dictionary=False): sen_l1, sen_l2 = retrieve_cornell_corpora() clean_sen_l1 = [clean_sentence(s) for s in sen_l1][:30000] ### OTHERWISE IT DOES NOT RUN ON MY LAPTOP clean_sen_l2 = [clean_sentence(s) for s in sen_l2][:30000] ### OTHERWISE IT DOES NOT RUN ON MY LAPTOP filt_clean_sen_l1, filt_clean_sen_l2 = filter_sentence_length(clean_sen_l1, clean_sen_l2, max_len=10) if not use_stored_dictionary: dict_l1 = create_indexed_dictionary(filt_clean_sen_l1, dict_size=10000, storage_path=path_l1_dict) dict_l2 = create_indexed_dictionary(filt_clean_sen_l2, dict_size=10000, storage_path=path_l2_dict) else: dict_l1 = pickle.load(open(path_l1_dict, "rb")) dict_l2 = pickle.load(open(path_l2_dict, "rb")) dict_l1_length = len(dict_l1) dict_l2_length = len(dict_l2) idx_sentences_l1 = sentences_to_indexes(filt_clean_sen_l1, dict_l1) idx_sentences_l2 = sentences_to_indexes(filt_clean_sen_l2, dict_l2) max_length_l1 = extract_max_length(idx_sentences_l1) max_length_l2 = extract_max_length(idx_sentences_l2) data_set = prepare_sentences(idx_sentences_l1, idx_sentences_l2, max_length_l1, max_length_l2) return (filt_clean_sen_l1, filt_clean_sen_l2), data_set, (max_length_l1, max_length_l2), (dict_l1_length, dict_l2_length) By inserting this function into the train_translator.py file and rename the file as train_chatbot.py, we can run the training of the chatbot. After a few iterations, you can stop the program and you'll see something similar to this output: [sentences_to_indexes] Did not find 0 words [sentences_to_indexes] Did not find 0 words global step 100 learning rate 1.0 step-time 7.708967611789704 perplexity 444.90090078460474 eval: perplexity 57.442316329639176 global step 200 learning rate 0.990234375 step-time 7.700247814655302 perplexity 48.8545568311572 eval: perplexity 42.190180314697045 global step 300 learning rate 0.98046875 step-time 7.69800933599472 perplexity 41.620538109894945 eval: perplexity 31.291903031786116 ... ... ... global step 2400 learning rate 0.79833984375 step-time 7.686293318271639 perplexity 3.7086356605442767 eval: perplexity 2.8348589631663046 global step 2500 learning rate 0.79052734375 step-time 7.689657487869262 perplexity 3.211876894960698 eval: perplexity 2.973809378544393 global step 2600 learning rate 0.78271484375 step-time 7.690396382808681 perplexity 2.878854805600354 eval: perplexity 2.563583924617356 Again, if you change the settings, you may end up with a different perplexity. To obtain these results, we set the RNN size to 256 and 2 layers, the batch size of 128 samples, and the learning rate to 1.0. At this point, the chatbot is ready to be tested. Although you can test the chatbot with the same code as in the test_translator.py, here we would like to do a more elaborate solution, which allows exposing the chatbot as a service with APIs. Chatbox API First of all, we need a web framework to expose the API. In this project, we've chosen Bottle, a lightweight simple framework very easy to use. To install the package, run pip install bottle from the command line. To gather further information and dig into the code, take a look at the project webpage, https://bottlepy.org. Let's now create a function to parse an arbitrary sentence provided by the user as an argument. All the following code should live in the test_chatbot_aas.py file. 
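Before that, if you have never used Bottle, the pattern we are about to follow boils down to three elements: the route() decorator to declare an endpoint, request.query to read GET parameters, and run() to start the server. Here is a minimal, self-contained sketch, purely illustrative and not part of test_chatbot_aas.py:

from bottle import route, run, request

@route('/echo')
def echo():
    # Read the "sentence" GET parameter, e.g. http://127.0.0.1:8080/echo?sentence=hello
    sentence = request.query.sentence
    # Returning a dict makes Bottle serialize the response as JSON
    return {'you_said': sentence}

run(host='127.0.0.1', port=8080)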
Let's start with some imports and the function to clean, tokenize and prepare the sentence using the dictionary: import pickle import sys import numpy as np import tensorflow as tf import data_utils from corpora_tools import clean_sentence, sentences_to_indexes, prepare_sentences from train_chatbot import get_seq2seq_model, path_l1_dict, path_l2_dict model_dir = "/home/abc/chat/chatbot_model" def prepare_sentence(sentence, dict_l1, max_length): sents = [sentence.split(" ")] clean_sen_l1 = [clean_sentence(s) for s in sents] idx_sentences_l1 = sentences_to_indexes(clean_sen_l1, dict_l1) data_set = prepare_sentences(idx_sentences_l1, [[]], max_length, max_length) sentences = (clean_sen_l1, [[]]) return sentences, data_set The function prepare_sentence does the following: Tokenizes the input sentence Cleans it (lowercase and punctuation cleanup) Converts tokens to dictionary IDs Add markers and paddings to reach the default length Next, we will need a function to convert the predicted sequence of numbers to an actual sentence composed of words. This is done by the function decode, which runs the prediction given the input sentence and with softmax predicts the most likely output. Finally, it returns the sentence without paddings and markers: def decode(data_set): with tf.Session() as sess: model = get_seq2seq_model(sess, True, dict_lengths, max_sentence_lengths, model_dir) model.batch_size = 1 bucket = 0 encoder_inputs, decoder_inputs, target_weights = model.get_batch( {bucket: [(data_set[0][0], [])]}, bucket) _, _, output_logits = model.step(sess, encoder_inputs, decoder_inputs, target_weights, bucket, True) outputs = [int(np.argmax(logit, axis=1)) for logit in output_logits] if data_utils.EOS_ID in outputs: outputs = outputs[1:outputs.index(data_utils.EOS_ID)] tf.reset_default_graph() return " ".join([tf.compat.as_str(inv_dict_l2[output]) for output in outputs]) Finally, the main function, that is, the function to run in the script: if __name__ == "__main__": dict_l1 = pickle.load(open(path_l1_dict, "rb")) dict_l1_length = len(dict_l1) dict_l2 = pickle.load(open(path_l2_dict, "rb")) dict_l2_length = len(dict_l2) inv_dict_l2 = {v: k for k, v in dict_l2.items()} max_lengths = 10 dict_lengths = (dict_l1_length, dict_l2_length) max_sentence_lengths = (max_lengths, max_lengths) from bottle import route, run, request @route('/api') def api(): in_sentence = request.query.sentence _, data_set = prepare_sentence(in_sentence, dict_l1, max_lengths) resp = [{"in": in_sentence, "out": decode(data_set)}] return dict(data=resp) run(host='127.0.0.1', port=8080, reloader=True, debug=True) Initially, it loads the dictionary and prepares the inverse dictionary. Then, it uses the Bottle API to create an HTTP GET endpoint (under the /api URL). The route decorator sets and enriches the function to run when the endpoint is contacted via HTTP GET. In this case, the api() function is run, which first reads the sentence passed as HTTP parameter, then calls the prepare_sentence function, described above, and finally runs the decoding step. What's returned is a dictionary containing both the input sentence provided by the user and the reply of the chatbot. Finally, the webserver is turned on, on the localhost at port 8080. Isn't very easy to have a chatbot as a service with Bottle? It's now time to run it and check the outputs. 
To run it, execute the following from the command line:

$> python3 -u test_chatbot_aas.py

Then, let's start querying the chatbot with some generic questions. To do so, we can use curl, a simple command-line tool; any browser is also fine, just remember that the URL should be encoded, for example, the space character should be replaced with its encoding, that is, %20. Curl makes things easier, having a simple way to encode the URL request. Here are a couple of examples:

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=how are you?"
{"data": [{"out": "i ' m here with you .", "in": "where are you?"}]}

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=are you here?"
{"data": [{"out": "yes .", "in": "are you here?"}]}

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=are you a chatbot?"
{"data": [{"out": "you ' for the stuff to be right .", "in": "are you a chatbot?"}]}

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=what is your name ?"
{"data": [{"out": "we don ' t know .", "in": "what is your name ?"}]}

$> curl -X GET -G http://127.0.0.1:8080/api --data-urlencode "sentence=how are you?"
{"data": [{"out": "that ' s okay .", "in": "how are you?"}]}

If the system doesn't work with your browser, try encoding the URL yourself, for example:

$> curl -X GET http://127.0.0.1:8080/api?sentence=how%20are%20you?
{"data": [{"out": "that ' s okay .", "in": "how are you?"}]}

Replies are quite funny; always remember that we trained the chatbot on movies, therefore the replies follow that style. To turn off the webserver, use Ctrl + C.

To summarize, we've learned to implement a chatbot which is able to respond to questions through an HTTP endpoint and a GET API. To learn more about designing deep learning systems for a variety of real-world scenarios using TensorFlow, do check out the book TensorFlow Deep Learning Projects.

Facebook's Wit.ai: Why we need yet another chatbot development framework?
How to build a chatbot with Microsoft Bot framework
Top 4 chatbot development frameworks for developers

How to implement immutability functions in Kotlin [Tutorial]

Aaron Lazar
27 Jun 2018
8 min read
Unlike Clojure, Haskell, F#, and the likes, Kotlin is not a pure functional programming language, where immutability is forced; rather, we may refer to Kotlin as a perfect blend of functional programming and OOP languages. It contains the major benefits of both worlds. So, instead of forcing immutability like pure functional programming languages, Kotlin encourages immutability, giving it automatic preference wherever possible. In this article, we'll understand the various methods of implementing immutability in Kotlin. This article has been taken from the book, Functional Kotlin, by Mario Arias and Rivu Chakraborty. In other words, Kotlin has immutable variables (val), but no language mechanisms that would guarantee true deep immutability of the state. If a val variable references a mutable object, its contents can still be modified. We will have a more elaborate discussion and a deeper dive on this topic, but first let us have a look at how we can get referential immutability in Kotlin and the differences between var, val, and const val. By true deep immutability of the state, we mean a property will always return the same value whenever it is called and that the property never changes its value; we can easily avoid this if we have a val  property that has a custom getter. You can find more details at the following link: https://artemzin.com/blog/kotlin-val-does-not-mean-immutable-it-just-means-readonly-yeah/ The difference between var and val So, in order to encourage immutability but still let the developers have the choice, Kotlin introduced two types of variables. The first one is var, which is just a simple variable, just like in any imperative language. On the other hand, val brings us a bit closer to immutability; again, it doesn't guarantee immutability. So, what exactly does the val variable provide us? It enforces read-only, you cannot write into a val variable after initialization. So, if you use a val variable without a custom getter, you can achieve referential immutability. Let's have a look; the following program will not compile: fun main(args: Array<String>) { val x:String = "Kotlin" x+="Immutable"//(1) } As I mentioned earlier, the preceding program will not compile; it will give an error on comment (1). As we've declared variable x as val, x will be read-only and once we initialize x; we cannot modify it afterward. So, now you're probably asking why we cannot guarantee immutability with val ? Let's inspect this with the following example: object MutableVal { var count = 0 val myString:String = "Mutable" get() {//(1) return "$field ${++count}"//(2) } } fun main(args: Array<String>) { println("Calling 1st time ${MutableVal.myString}") println("Calling 2nd time ${MutableVal.myString}") println("Calling 3rd time ${MutableVal.myString}")//(3) } In this program, we declared myString as a val property, but implemented a custom get function, where we tweaked the value of myString before returning it. Have a look at the output first, then we will further look into the program: As you can see, the myString property, despite being val, returned different values every time we accessed it. So, now, let us look into the code to understand such behavior. On comment (1), we declared a custom getter for the val property myString. On comment (2), we pre-incremented the value of count and added it after the value of the field value, myString, and returned the same from the getter. 
So, whenever we requested the myString property, count got incremented and, on the next request, we got a different value. As a result, we broke the immutable behavior of a val property. Compile time constants So, how can we overcome this? How can we enforce immutability? The const val properties are here to help us. Just modify val myString with const val myString and you cannot implement the custom getter. While val properties are read-only variables, const val, on the other hand, are compile time constants. You cannot assign the outcome (result) of a function to const val. Let's discuss some of the differences between val and const val: The val properties are read-only variables, while const val are compile time constants The val properties can have custom getters, but const val cannot We can have val properties anywhere in our Kotlin code, inside functions, as a class member, anywhere, but const val has to be a top-level member of a class/object You cannot write delegates for the const val properties We can have the val property of any type, be it our custom class or any primitive data type, but only primitive data types and String are allowed with a const val property We cannot have nullable data types with the const val properties; as a result, we cannot have null values for the const val properties either As a result, the const val properties guarantee immutability of value but have lesser flexibility and you are bound to use only primitive data types with const val, which cannot always serve our purposes. Now, that I've used the word referential immutability quite a few times, let us now inspect what it means and how many types of immutability there are. Types of immutability There are basically the following two types of immutability: Referential immutability Immutable values Immutable reference  (referential immutability) Referential immutability enforces that, once a reference is assigned, it can't be assigned to something else. Think of having it as a val property of a custom class, or even MutableList or MutableMap; after you initialize the property, you cannot reference something else from that property, except the underlying value from the object. For example, take the following program: class MutableObj { var value = "" override fun toString(): String { return "MutableObj(value='$value')" } } fun main(args: Array<String>) { val mutableObj:MutableObj = MutableObj()//(1) println("MutableObj $mutableObj") mutableObj.value = "Changed"//(2) println("MutableObj $mutableObj") val list = mutableListOf("a","b","c","d","e")//(3) println(list) list.add("f")//(4) println(list) } Have a look at the output before we proceed with explaining the program: So, in this program we've two val properties—list and mutableObj. We initialized mutableObj with the default constructor of MutableObj, since it's a val property it'll always refer to that specific object; but, if you concentrate on comment (2), we changed the value property of mutableObj, as the value property of the MutableObj class is mutable (var). It's the same with the list property, we can add items to the list after initialization, changing its underlying value. Both list and mutableObj are perfect examples of immutable reference; once initialized, the properties can't be assigned to something else, but their underlying values can be changed (you can refer the output). The reason behind that is the data type we used to assign to those properties. 
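For contrast, if the type we reference exposes only read-only state, a val reference really does behave as an immutable value. Here is a small sketch using a data class with a val property; this is an illustration, not one of the book's listings:

data class ImmutableObj(val value: String)

fun main(args: Array<String>) {
    val immutableObj = ImmutableObj("Kotlin")
    // immutableObj.value = "Changed"        // does not compile: value is a val
    // immutableObj = ImmutableObj("Other")  // does not compile: immutableObj is a val

    // The only way to "change" it is to create a new instance,
    // leaving the original untouched.
    val modifiedCopy = immutableObj.copy(value = "Changed")
    println(immutableObj)  // ImmutableObj(value=Kotlin)
    println(modifiedCopy)  // ImmutableObj(value=Changed)
}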
Both the MutableObj class and the MutableList<String> data structures are mutable themselves, so we cannot restrict value changes for their instances. Immutable values The immutable values, on the other hand, enforce no change on values as well; it is really complex to maintain. In Kotlin, the const val properties enforce immutability of value, but they lack flexibility (we already discussed them) and you're bound to use only primitive types, which can be troublesome in real-life scenarios. Immutable collections Kotlin gives preference to immutability wherever possible, but leaves the choice to the developer whether or when to use it. This power of choice makes the language even more powerful. Unlike most languages, where they have either only mutable (like Java, C#, and so on) or only immutable collections (like F#, Haskell, Clojure, and so on), Kotlin has both and distinguishes between them, leaving the developer with the freedom to choose whether to use an immutable or mutable one. Kotlin has two interfaces for collection objects—Collection<out E> and MutableCollection<out E>; all the collection classes (for example, List, Set, or Map) implement either of them. As the name suggests, the two interfaces are designed to serve immutable and mutable collections respectively. Let us have an example: fun main(args: Array<String>) { val immutableList = listOf(1,2,3,4,5,6,7)//(1) println("Immutable List $immutableList") val mutableList:MutableList<Int> = immutableList.toMutableList()//(2) println("Mutable List $mutableList") mutableList.add(8)//(3) println("Mutable List after add $mutableList") println("Mutable List after add $immutableList") } The output is as follows: So, in this program, we created an immutable list with the help of the listOf method of Kotlin, on comment (1). The listOf method creates an immutable list with the elements (varargs) passed to it. This method also has a generic type parameter, which can be skipped if the elements array is not empty. The listOf method also has a mutable version—mutableListOf() which is identical except that it returns MutableList instead. We can convert an immutable list to a mutable one with the help of the toMutableList() extension function, we did the same in comment (2), to add an element to it on comment (3). However, if you check the output, the original Immutable List remains the same without any changes, the item is, however, added to the newly created MutableList instead. So now you know how to implement immutability in Kotlin. If you found this tutorial helpful, and would like to learn more, head on over to purchase the full book, Functional Kotlin, by Mario Arias and Rivu Chakraborty. Extension functions in Kotlin: everything you need to know Building RESTful web services with Kotlin Building chat application with Kotlin using Node.js, the powerful Server-side JavaScript platform

Build an IoT application with Google Cloud [Tutorial]

Gebin George
27 Jun 2018
19 min read
In this tutorial, we will build a sample internet of things application using Google Cloud IoT. We will start off by implementing the end-to-end solution, where we take the data from the DHT11 sensor and post it to the Google IoT Core state topic. This article is an excerpt from the book, Enterprise Internet of Things Handbook, written by Arvind Ravulavaru. End-to-end communication To get started with Google IoT Core, we need to have a Google account. If you do not have a Google account, you can create one by navigating to this URL: https://accounts.google.com/SignUp?hl=en. Once you have created your account, you can login and navigate to Google Cloud Console: https://console.cloud.google.com. Setting up a project The first thing we are going to do is create a project. If you have already worked with Google Cloud Platform and have at least one project, you will be taken to the first project in the list or you will be taken to the Getting started page. As of the time of writing this book, Google Cloud Platform has a free trial for 12 months with $300 if the offer is still available when you are reading this chapter, I would highly recommend signing up: Once you have signed up, let's get started by creating a new project. From the top menu bar, select the Select a Project dropdown and click on the plus icon to create a new project. You can fill in the details as illustrated in the following screenshot: Click on the Create button. Once the project is created, navigate to the Project and you should land on the Home page. Enabling APIs Following are the steps to be followed for enabling APIs: From the menu on the left-hand side, select APIs & Services | Library as shown in the following screenshot: On the following screen, search for pubsub and select the Pub/Sub API from the results and we should land on a page similar to the following: Click on the ENABLE button and we should now be able to use these APIs in our project. Next, we need to enable the real-time API; search for realtime and we should find something similar to the following: Click on the ENABLE & button. Enabling device registry and devices The following steps should be used for enabling device registry and devices: From the left-hand side menu, select IoT Core and we should land on the IoT Core home page: Instead of the previous screen, if you see a screen to enable APIs, please enable the required APIs from here. Click on the & Create device registry button. On the Create device registry screen, fill the details as shown in the following table: Field Value Registry ID Pi3-DHT11-Nodes Cloud region us-central1 Protocol MQTT HTTP Default telemetry topic device-events Default state topic dht11 After completing all the details, our form should look like the following: We will add the required certificates later on. Click on the Create button and a new device registry will be created. From the Pi3-DHT11-Nodes registry page, click on the Add device button and set the Device ID as Pi3-DHT11-Node or any other suitable name. Leave everything as the defaults and make sure the Device communication is set to Allowed and create a new device. On the device page, we should see a warning as highlighted in the following screenshot: Now, we are going to add a new public key. To generate a public/private key pair, we need to have OpenSSL command line available. You can download and set up OpenSSL from here: https://www.openssl.org/source/. 
Use the following command to generate a certificate pair at the default location on your machine: openssl req -x509 -newkey rsa:2048 -keyout rsa_private.pem -nodes -out rsa_cert.pem -subj "/CN=unused" If everything goes well, you should see an output as shown here: Do not share these certificates anywhere; anyone with these certificates can connect to Google IoT Core as a device and start publishing data. Now, once the certificates are created, we will attach them to the device we have created in IoT Core. Head back to the device page of the Google IoT Core service and under Authentication click on Add public key. On the following screen, fill it in as illustrated: The public key value is the contents of rsa_cert.pem that we generated earlier. Click on the ADD button. Now that the public key has been successfully added, we can connect to the cloud using the private key. Setting up Raspberry Pi 3 with DHT11 node Now that we have our device set up in Google IoT Core, we are going to complete the remaining operation on Raspberry Pi 3 to send data. Pre-requisites The requirements for setting up Raspberry Pi 3 on a DHT11 node are: One Raspberry Pi 3: https://www.amazon.com/Raspberry-Pi-Desktop-Starter-White/dp/B01CI58722 One breadboard: https://www.amazon.com/Solderless-Breadboard-Circuit-Circboard-Prototyping/dp/B01DDI54II/ One DHT11 sensor: https://www.amazon.com/HiLetgo-Temperature-Humidity-Arduino-Raspberry/dp/B01DKC2GQ0 Three male-to-female jumper cables: https://www.amazon.com/RGBZONE-120pcs-Multicolored-Dupont-Breadboard/dp/B01M1IEUAF/ If you are new to the world of Raspberry Pi GPIO's interfacing, take a look at this Raspberry Pi GPIO Tutorial: The Basics Explained on YouTube: https://www.youtube.com/watch?v=6PuK9fh3aL8. The following steps are to be used for the setup process: Connect the DHT11 sensor to Raspberry Pi 3 as shown in the following diagram: Next, power up Raspberry Pi 3 and log in to it. On the desktop, create a new folder named Google-IoT-Device. Open a new Terminal and cd into this folder. Setting up Node.js Refer to the following steps to install Node.js: Open a new Terminal and run the following commands: $ sudo apt update $ sudo apt full-upgrade This will upgrade all the packages that need upgrades. Next, we will install the latest version of Node.js. We will be using the Node 7.x version: $ curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash - $ sudo apt install nodejs This will take a moment to install, and once your installation is done, you should be able to run the following commands to see the version of Node.js and npm: $ node -v $ npm -v Developing the Node.js device app Now, we will set up the app and write the required code: From the Terminal, once you are inside the Google-IoT-Device folder, run the following command: $ npm init -y Next, we will install jsonwebtoken (https://www.npmjs.com/package/jsonwebtoken) and mqtt (https://www.npmjs.com/package/mqtt) from npm. Execute the following command: $ npm install jsonwebtoken mqtt--save Next, we will install rpi-dht-sensor (https://www.npmjs.com/package/rpi-dht-sensor) from npm. 
This module will help in reading the DHT11 temperature and humidity values: $ npm install rpi-dht-sensor --save Your final package.json file should look similar to the following code snippet: { "name": "Google-IoT-Device", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo "Error: no test specified" && exit 1" }, "keywords": [], "author": "", "license": "ISC", "dependencies": { "jsonwebtoken": "^8.1.1", "mqtt": "^2.15.3", "rpi-dht-sensor": "^0.1.1" } } Now that we have the required dependencies installed, let's continue. Create a new file named index.js at the root of the Google-IoT-Device folder. Next, create a folder named certs at the root of the Google-IoT-Device folder and move the two certificates we created using OpenSSL there. Your final folder structure should look something like this: Open index.js in any text editor and update it as shown here: var fs = require('fs'); var jwt = require('jsonwebtoken'); var mqtt = require('mqtt'); var rpiDhtSensor = require('rpi-dht-sensor'); var dht = new rpiDhtSensor.DHT11(2); // `2` => GPIO2 var projectId = 'pi-iot-project'; var cloudRegion = 'us-central1'; var registryId = 'Pi3-DHT11-Nodes'; var deviceId = 'Pi3-DHT11-Node'; var mqttHost = 'mqtt.googleapis.com'; var mqttPort = 8883; var privateKeyFile = '../certs/rsa_private.pem'; var algorithm = 'RS256'; var messageType = 'state'; // or event var mqttClientId = 'projects/' + projectId + '/locations/' + cloudRegion + '/registries/' + registryId + '/devices/' + deviceId; var mqttTopic = '/devices/' + deviceId + '/' + messageType; var connectionArgs = { host: mqttHost, port: mqttPort, clientId: mqttClientId, username: 'unused', password: createJwt(projectId, privateKeyFile, algorithm), protocol: 'mqtts', secureProtocol: 'TLSv1_2_method' }; console.log('connecting...'); var client = mqtt.connect(connectionArgs); // Subscribe to the /devices/{device-id}/config topic to receive config updates. client.subscribe('/devices/' + deviceId + '/config'); client.on('connect', function(success) { if (success) { console.log('Client connected...'); sendData(); } else { console.log('Client not connected...'); } }); client.on('close', function() { console.log('close'); }); client.on('error', function(err) { console.log('error', err); }); client.on('message', function(topic, message, packet) { console.log(topic, 'message received: ', Buffer.from(message, 'base64').toString('ascii')); }); function createJwt(projectId, privateKeyFile, algorithm) { var token = { 'iat': parseInt(Date.now() / 1000), 'exp': parseInt(Date.now() / 1000) + 86400 * 60, // 1 day 'aud': projectId }; var privateKey = fs.readFileSync(privateKeyFile); return jwt.sign(token, privateKey, { algorithm: algorithm }); } function fetchData() { var readout = dht.read(); var temp = readout.temperature.toFixed(2); var humd = readout.humidity.toFixed(2); return { 'temp': temp, 'humd': humd, 'time': new Date().toISOString().slice(0, 19).replace('T', ' ') // https://stackoverflow.com/a/11150727/1015046 }; } function sendData() { var payload = fetchData(); payload = JSON.stringify(payload); console.log(mqttTopic, ': Publishing message:', payload); client.publish(mqttTopic, payload, { qos: 1 }); console.log('Transmitting in 30 seconds'); setTimeout(sendData, 30000); } In the previous code, we first define the projectId, cloudRegion, registryId, and deviceId based on what we have created. Next, we build the connectionArgs object, using which we are going to connect to Google IoT Core using MQTT-SN. 
Do note that the password property is a JSON Web Token (JWT), based on the projectId and privateKeyFile algorithm. The token that is created by this function is valid only for one day. After one day, the cloud will refuse connection to this device if the same token is used. The username value is the Common Name (CN) of the certificate we have created, which is unused. Using mqtt.connect(), we are going to connect to the Google IoT Core. And we are subscribing to the device config topic, which can be used to send device configurations when connected. Once the connection is successful, we callsendData() every 30 seconds to send data to the state topic. Save the previous file and run the following command: $ sudo node index.js And we should see something like this: As you can see from the previous Terminal logs, the device first gets connected then starts transmitting the temperature and humidity along with time. We are sending time as well, so we can save it in the BigQuery table and then build a time series chart quite easily. Now, if we head back to the Device page of Google IoT Core and navigate to the Configuration & state history tab, we should see the data that we are sending to the state topic here: Now that the device is sending data, let's actually read the data from another client. Reading the data from the device For this, you can either use the same Raspberry Pi 3 or another computer. I am going to use MacBook as a client that is interested in the data sent by the Thing. Setting up credentials Before we start reading data from Google IoT Core, we have to set up our computer (for example, MacBook) as a trusted device, so our computer can request data. Let's perform the following steps to set the credentials: To do this, we need to create a new Service account key. From the left-hand-side menu of the Google Cloud Console, select APIs & Services | Credentials. Then click on the Create credentials dropdown and select Service account key as shown in the following screenshot: Now, fill in the details as shown in the following screenshot: We have given access to the entire project for this client and as an Owner. Do not select these settings if this is a production application. Click on Create and you will be asked to download and save the file. Do not share this file; this file is as good as giving someone owner-level permissions to all assets of this project. Once the file is downloaded somewhere safe, create an environment variable with the name GOOGLE_APPLICATION_CREDENTIALS and point it to the path of the downloaded file. You can refer to Getting Started with Authentication at https://cloud.google.com/docs/authentication/getting-started if you are facing any difficulties. Setting up subscriptions The data from the device is being sent to Google IoT Core using the state topic. If you recall, we have named that topic dht11. Now, we are going to create a subscription for this topic: From the menu on the left side, select Pub/Sub | Topics. Now, click on New subscription for the dht11 topic, as shown in the following screenshot: Create a new subscription by setting up the options selected in this screenshot: We are going to use the subscription named dht11-data to get the data from the state topic. Setting up the client Now that we have provided the required credentials as well as subscribed to a Pub/Sub topic, we will set up the Pub/Sub client. Follow these steps: Create a folder named test_client inside the test_client directory. 
Now, run the following command: $ npm init -y Next, install the @google-cloud/pubsub (https://www.npmjs.com/package/@google-cloud/pubsub) module with the help of the following command: $ npm install @google-cloud/pubsub --save Create a file inside the test_client folder named index.js and update it as shown in this code snippet: var PubSub = require('@google-cloud/pubsub'); var projectId = 'pi-iot-project'; var stateSubscriber = 'dht11-data' // Instantiates a client var pubsub = new PubSub({ projectId: projectId, }); var subscription = pubsub.subscription('projects/' + projectId + '/subscriptions/' + stateSubscriber); var messageHandler = function(message) { console.log('Message Begin >>>>>>>>'); console.log('message.connectionId', message.connectionId); console.log('message.attributes', message.attributes); console.log('message.data', Buffer.from(message.data, 'base64').toString('ascii')); console.log('Message End >>>>>>>>>>'); // "Ack" (acknowledge receipt of) the message message.ack(); }; // Listen for new messages subscription.on('message', messageHandler); Update the projectId and stateSubscriber in the previous code. Now, save the file and run the following command: $ node index.js We should see the following output in the console: This way, any client that is interested in the data of this device can use this approach to get the latest data. With this, we conclude the section on posting data to Google IoT Core and fetching the data. In the next section, we are going to work on building a dashboard. Building a dashboard Now that we have seen how a client can read the data from our device on demand, we will move on to building a dashboard, where we display data in real time. For this, we are going to use Google Cloud Functions, Google BigQuery, and Google Data Studio. Google Cloud Functions Cloud Functions are solution for serverless services. Cloud Functions is a lightweight solution for creating standalone and single-purpose functions that respond to cloud events. You can read more about Google Cloud Functions at https://cloud.google.com/functions/. Google BigQuery Google BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google's infrastructure. You can read more about Google BigQuery at https://cloud.google.com/bigquery/. Google Data Studio Google Data Studio helps to build dashboards and reports using various data connectors, such as BigQuery or Google Analytics. You can read more about Google Data Studio at https://cloud.google.com/data-studio/. As of April 2018, these three services are still in beta. As we have already seen in the Architecture section, once the data is published on the state topic, we are going to create a cloud function that will get triggered by the data event on the Pub/Sub client. And inside our cloud function, we are going to get a copy of the published data and then insert it into the BigQuery dataset. Once the data is inserted, we are going to use Google Data Studio to create a new report by linking the BigQuery dataset to the input. So, let's get started. Setting up BigQuery The first thing we are going to do is set up BigQuery: From the side menu of the Google Cloud Platform Console, our project page, click on the BigQuery URL and we should be taken to the Google BigQuery home page. 
Select Create new dataset, as shown in the following screenshot: Create a new dataset with the values illustrated in the following screenshot: Once the dataset is created, click on the plus sign next to the dataset and create an empty table. We are going to name the table dht11_data and we are going have three fields in it, as shown here: Click on the Create Table button to create the table. Now that we have our table ready, we will write a cloud function to insert the incoming data from Pub/Sub into this table. Setting up Google Cloud Function Now, we are going to set up a cloud function that will be triggered by the incoming data: From the Google Cloud Console's left-hand-side menu, select Cloud Functions under Compute. Once you land on the Google Cloud Functions homepage, you will be asked to enable the cloud functions API. Click on Enable API: Once the API is enabled, we will be on the Create function page. Fill in the form as shown here: The Trigger is set to Cloud Pub/Sub topic and we have selected dht11 as the Topic. Under the Source code section; make sure you are in the index.js tab and update it as shown here: var BigQuery = require('@google-cloud/bigquery'); var projectId = 'pi-iot-project'; var bigquery = new BigQuery({ projectId: projectId, }); var datasetName = 'pi3_dht11_dataset'; var tableName = 'dht11_data'; exports.pubsubToBQ = function(event, callback) { var msg = event.data; var data = JSON.parse(Buffer.from(msg.data, 'base64').toString()); // console.log(data); bigquery .dataset(datasetName) .table(tableName) .insert(data) .then(function() { console.log('Inserted rows'); callback(); // task done }) .catch(function(err) { if (err && err.name === 'PartialFailureError') { if (err.errors && err.errors.length > 0) { console.log('Insert errors:'); err.errors.forEach(function(err) { console.error(err); }); } } else { console.error('ERROR:', err); } callback(); // task done }); }; In the previous code, we were using the BigQuery Node.js module to insert data into our BigQuery table. Update projectId, datasetName, and tableName as applicable in the code. Next, click on the package.json tab and update it as shown: { "name": "cloud_function", "version": "0.0.1", "dependencies": { "@google-cloud/bigquery": "^1.0.0" } } Finally, for the Function to execute field, enter pubsubToBQ. pubsubToBQ is the name of the function that has our logic and this function will be called when the data event occurs. Click on the Create button and our function should be deployed in a minute. Running the device Now that the entire setup is done, we will start pumping data into BigQuery: Head back to Raspberry Pi 3 which was sending the DHT11 temperature and humidity data, and run the application. We should see the data being published to the state topic: Now, if we head back to the Cloud Functions page, we should see the requests coming into the cloud function: You can click on VIEW LOGS to view the logs of each function execution: Now, head over to our table in BigQuery and click on the RUN QUERY button; run the query as shown in the following screenshot: Now, all the data that was generated by the DHT11 sensor is timestamped and stored in BigQuery. You can use the Save to Google Sheets button to save this data to Google Sheets and analyze the data there or plot graphs, as shown here: Or we can go one step ahead and use the Google Data Studio to do the same. 
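Before moving on to Data Studio, you can also verify the inserted rows from code rather than from the BigQuery console. The following is a minimal sketch that reuses the same @google-cloud/bigquery module we used in the cloud function; the project ID, dataset, table, and field names (pi-iot-project, pi3_dht11_dataset, dht11_data, time, temp, humd) are the ones created earlier, so substitute your own values if they differ:

// query_check.js - a minimal sketch to read back the latest sensor rows from BigQuery.
// Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service account key with
// BigQuery access, and that the project/dataset/table names below match yours.
var BigQuery = require('@google-cloud/bigquery');

var bigquery = new BigQuery({ projectId: 'pi-iot-project' });

var query = 'SELECT time, temp, humd ' +
            'FROM `pi-iot-project.pi3_dht11_dataset.dht11_data` ' +
            'ORDER BY time DESC LIMIT 10';

bigquery
  .query({ query: query, useLegacySql: false })
  .then(function(results) {
    var rows = results[0];
    // Print the ten most recent readings
    rows.forEach(function(row) {
      console.log(row.time, row.temp, row.humd);
    });
  })
  .catch(function(err) {
    console.error('Query failed:', err);
  });

Run it with node query_check.js from a folder where the module is installed (npm install @google-cloud/bigquery); if rows are printed here, Data Studio will be able to read the same table in the next section.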
Google Data Studio reports Now that the data is ready in BigQuery, we are going to set up Google Data Studio and then connect both of them, so we can access the data from BigQuery in Google Data Studio: Navigate to https://datastudio.google.com and log in with your Google account. Once you are on the Home page of Google Data Studio, click on the Blank report template. Make sure you read and agree to the terms and conditions before proceeding. Name the report PI3 DHT11 Sensor Data. Using the Create new data source button, we will create a new data source. Click on Create new data source and we should land on a page where we need to create a new Data Source. From the list of Connectors, select BigQuery; you will be asked to authorize Data Studio to interface with BigQuery, as shown in the following screenshot: Once we authorized, we will be shown our projects and related datasets and tables: Select the dht11_data table and click on Connect. This fetches the metadata of the table as shown here: Set the Aggregation for the temp and humd fields to Max and set the Type for time as Date & Time. Pick Minute (mm) from the sub-list. Click on Add to report and you will be asked to authorize Google Data Studio to read data from the table. Once the data source has been successfully linked, we will create a new time series chart. From the menu, select Insert | Time Series link. Update the data configuration of the chart as shown in the following screenshot: You can play with the styles as per your preference and we should see something similar to the following screenshot: This report can then be shared with any user. With this, we have seen the basic features and implementation process needed to work with Google Cloud IoT Core as well other features of the platform. If you found this post useful, do check out the book,  Enterprise Internet of Things Handbook, to build state of the art IoT applications best-fit for Enterprises. Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT Five Most Surprising Applications of IoT How IoT is going to change tech teams
Create a data model in Splunk to enable interactive reports and dashboards

Pravin Dhandre
26 Jun 2018
8 min read
Data models enable you to create Splunk reports and dashboards without having to develop Splunk search. Typically, data models are designed by those that understand the specifics around the format, the semantics of certain data, and the manner in which users may expect to work with that data. In building a typical data model, knowledge managers use knowledge object types (such as lookups, transactions, search-time field extractions, and calculated fields). Today we are going to learn how to create a Splunk data model and how to describe that model with various fields and lookup attributes. This article is an excerpt from a book written by James D. Miller titled Implementing Splunk 7 - Third Edition.  Creating a data model So now that we have a general idea of what a Splunk data model is, let's go ahead and create one. Before we can get started, we need to verify that our user ID is set up with the proper access required to create a data model. By default, only users with an admin or power role can create data models. For other users, the ability to create a data model depends on whether their roles have write access to an app. To begin (once you have verified that you do have access to create a data model), you can click on Settings and then on Data models (under KNOWLEDGE): This takes you to the Data Models (management) page, shown in the next screenshot. This is where a list of data models is displayed. From here, you can manage permissions, acceleration, cloning, and removal of existing data models. You can also use this page to upload a data model or create new data models, using the Upload Data Model and New Data Model buttons on the upper-right corner, respectively. Since this is a new data model, you can click on the button labeled New Data Model. This will open the New Data Model dialog box (shown in the following image). We can fill in the required information in this dialog box: Filling in the new data model dialog You have four fields to fill in order to describe your new Splunk data model (Title, ID, App, and Description): Title: Here you must enter a Title for your data model. This field accepts any character, as well as spaces. The value you enter here is what will appear on the data model listing page. ID: This is an optional field. It gets prepopulated with what you entered for your data model title (with any spaces replaced with underscores. Take a minute to make sure you have a good one, since once you enter the data model ID, you can't change it. App: Here you select (from a drop-down list) the Splunk app that your data model will serve. Description: The description is also an optional field, but I recommend adding something descriptive to later identify your data model. Once you have filled in these fields, you can click on the button labeled Create. This opens the data model (in our example, Aviation Games) in the Splunk Edit Objects page as shown in the following screenshot: The next step in defining a data model is to add the first object. As we have already stated, data models are typically composed of object hierarchies built on root event objects. Each root event object represents a set of data that is defined by a constraint, which is a simple search that filters out events that are not relevant to the object. Getting back to our example, let's create an object for our data model to track purchase requests on our Aviation Games website. 
To define our first event-based object, click on Add Dataset (as shown in the following screenshot): Our data model's first object can either be a Root Event or a Root Search. We're going to add a Root Event, so select Root Event. This will take you to the Add Event Dataset editor: Our example event will expose events that contain the phrase error, which represents processing errors that have occurred within our data source. So, for Dataset Name, we will enter Processing Errors. The Dataset ID will automatically populate when you type in the Dataset Name (you can edit it if you want to change it). For our object's constraint, we'll enter sourcetype=tm1* error. This constraint defines the events that will be reported on (all events that contain the phrase error that are indexed in the data sources starting with tm1). After providing Constraints for the event-based object, you can click on Preview to test whether the constraints you've supplied return the kind of events that you want. The following screenshot depicts the preview of the constraints given in this example: After reviewing the output, click on Save. The list of attributes for our root object is displayed: host, source, sourcetype, and _time. If you want to add child objects for client and server errors, you need to edit the attributes list to include additional attributes: Editing fields (attributes) Let's add an auto-extracted attribute, as mentioned earlier in this chapter, to our data model. Remember, auto-extracted attributes are derived by Splunk at search time. To start, click on Add Field: Next, select Auto-Extracted. The Add Auto-Extracted Field window opens: You can scroll through the list of automatically extracted fields and check the fields that you want to include. Since my data model example deals with errors that occurred, I've selected date_mday, date_month, and date_year. Notice that to the right of the field list, you have the opportunity to rename and type-set each of the fields that you selected. Rename is self-explanatory, but for Type, Splunk allows you to select String, Number, Boolean, or IPv4, and to indicate whether the attribute is Required, Optional, Hidden, or Hidden & Required. Optional means that the attribute doesn't have to appear in every event represented by the object; the attribute may appear in some of the object's events and not in others. Once you have reviewed your selected field types, click on Save: Lookup attributes Let's discuss lookup attributes now. Splunk can use existing lookup definitions to match the values of an attribute that you select to the values of a field in the specified lookup table. It then returns the corresponding field/value combinations and applies them to your object as (lookup) attributes. Once again, if you click on Add Field and select Lookup, Splunk opens the Add Fields with a Lookup page (shown in the following screenshot) where you can select from your currently defined lookup definitions. For this example, we select dnslookup: The dnslookup converts clienthost to clientip. We can configure a lookup attribute using this lookup to add that result to the processing errors objects. Under Input, select clienthost for Field in Lookup and Field in Dataset. Field in Lookup is the field to be used in the lookup table. Field in Dataset is the name of the field used in the event data. In our simple example, Splunk will match the field clienthost with the field host: Under Output, I have selected host as the output field to be matched with the lookup. 
You can provide a Display Name for the selected field. This display name is the name used for the field in your events. I simply typed AviationLookupName for my display name (see the following screenshot): Again, Splunk allows you to click on Preview to review the fields that you want to add. You can use the tabs to view the Events in a table, or view the values of each of the fields that you selected in Output. For example, the following screenshot shows the values of AviationLookupName: Finally, we can click on Save: Add Child object to our model We have just added a root (or parent) object to our data model. The next step is to add some children. Although a child object inherits all the constraints and attributes from its parent, when you create a child, you will give it additional constraints with the intention of further filtering the dataset that the object represents. To add a child object to our data model, click on Add Field and select Child: Splunk then opens the editor window, Add Child Dataset (shown in the following screenshot): On this page, follow these steps: Enter the Object Name: Dimensional Errors. Leave the Object ID as it is: Dimensional_Errors. Under Inherit From, select Processing Errors. This means that this child object will inherit all the attributes from the parent object, Processing Errors. Add the Additional Constraints, dimension, which means that the data models search for the events in this object; when expanded, it will look something like sourcetype=tm1* error dimension. Finally, click on Save to save your changes: Following the previously outlined steps, you can add more objects, each continuing to filter the results until you have the results that you need. With this we learned to create data models, and manage permissions, cloning and accelerating operational data models with ease. If you found this tutorial useful, do check out the book Implementing Splunk 7 - Third Edition and start transforming machine-generated data into valuable and actionable business insights. How to use R to boost your Data Model Building a Microsoft Power BI Data Model Splunk’s Input Methods and Data Feeds
How to set up the Scala Plugin in IntelliJ IDE [Tutorial]

Pavan Ramchandani
26 Jun 2018
2 min read
The Scala Plugin turns a standard IntelliJ IDEA installation into a convenient Scala development environment. In this article, we will discuss how to set up the Scala Plugin for the IntelliJ IDEA IDE. If you do not have IntelliJ IDEA, you can download it from here. By default, IntelliJ IDEA does not come with Scala features. The Scala Plugin adds those features, which means that we can create Scala/Play projects, Scala applications, Scala worksheets, and more. The Scala Plugin covers the following technologies: Scala, Play Framework, SBT, and Scala.js. It supports three popular OS environments: Windows, Mac, and Linux. Setting up the Scala Plugin for the IntelliJ IDE Perform the following steps to install the Scala Plugin for the IntelliJ IDE so that we can develop our Scala-based projects: Open the IntelliJ IDE: Go to Configure at the bottom right and click on the Plugins option available in the drop-down, as shown here: This opens the Plugins window as shown here: Now click on Install JetBrains plugins, as shown in the preceding screenshot. Next, type the word Scala in the search bar to see the Scala Plugin, as shown here: Click on the Install button to install the Scala Plugin for IntelliJ IDEA. Now restart IntelliJ IDEA so that the Scala Plugin features become available. After we re-open IntelliJ IDEA, if we access the File | New Project option, we will see the Scala option in the New Project window, as shown in the following screenshot, which lets us create new Scala or Play Framework-based SBT projects: We can see the Play Framework option only in the IntelliJ IDEA Ultimate Edition. As we are using CE (Community Edition), we cannot see that option. It's now time to start Scala/Play application development using the IntelliJ IDE. To summarize, we gained an understanding of the Scala Plugin and covered the installation steps for the Scala Plugin for IntelliJ. To learn more about taking a reactive programming approach with Scala, please refer to the book Scala Reactive Programming. What Scala 3.0 Roadmap looks like! Building Scalable Microservices Exploring Scala Performance
How to interact with HBase using HBase shell [Tutorial]

Amey Varangaonkar
25 Jun 2018
9 min read
HBase is among the top five most popular and widely-deployed NoSQL databases. It is used to support critical production workloads across hundreds of organizations. It is supported by multiple vendors (in fact, it is one of the few databases that is multi-vendor), and more importantly has an active and diverse developer and user community. In this article, we see how to work with the HBase shell in order to efficiently work on the massive amounts of data. The following excerpt is taken from the book '7 NoSQL Databases in a Week' authored by Aaron Ploetz et al. Working with the HBase shell The best way to get started with understanding HBase is through the HBase shell. Before we do that, we need to first install HBase. An easy way to get started is to use the Hortonworks sandbox. You can download the sandbox for free from https://hortonworks.com/products/sandbox/. The sandbox can be installed on Linux, Mac and Windows. Follow the instructions to get this set up. On any cluster where the HBase client or server is installed, type hbase shell to get a prompt into HBase: hbase(main):004:0> version 1.1.2.2.3.6.2-3, r2873b074585fce900c3f9592ae16fdd2d4d3a446, Thu Aug 4 18:41:44 UTC 2016 This tells you the version of HBase that is running on the cluster. In this instance, the HBase version is 1.1.2, provided by a particular Hadoop distribution, in this case HDP 2.3.6: hbase(main):001:0> help HBase Shell, version 1.1.2.2.3.6.2-3, r2873b074585fce900c3f9592ae16fdd2d4d3a446, Thu Aug 4 18:41:44 UTC 2016 Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command. Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group. This provides the set of operations that are possible through the HBase shell, which includes DDL, DML, and admin operations. hbase(main):001:0> create 'sensor_telemetry', 'metrics' 0 row(s) in 1.7250 seconds => Hbase::Table - sensor_telemetry This creates a table called sensor_telemetry, with a single column family called metrics. As we discussed before, HBase doesn't require column names to be defined in the table schema (and in fact, has no provision for you to be able to do so): hbase(main):001:0> describe 'sensor_telemetry' Table sensor_telemetry is ENABLED sensor_telemetry COLUMN FAMILIES DESCRIPTION {NAME => 'metrics', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>'0'} 1 row(s) in 0.5030 seconds This describes the structure of the sensor_telemetry table. The command output indicates that there's a single column family present called metrics, with various attributes defined on it. BLOOMFILTER indicates the type of bloom filter defined for the table, which can either be a bloom filter of the ROW type, which probes for the presence/absence of a given row key, or of the ROWCOL type, which probes for the presence/absence of a given row key, col-qualifier combination. You can also choose to have BLOOMFILTER set to None. The BLOCKSIZE configures the minimum granularity of an HBase read. By default, the block size is 64 KB, so if the average cells are less than 64 KB, and there's not much locality of reference, you can lower your block size to ensure there's not more I/O than necessary, and more importantly, that your block cache isn't wasted on data that is not needed. 
VERSIONS refers to the maximum number of cell versions that are to be kept around: hbase(main):004:0> alter 'sensor_telemetry', {NAME => 'metrics', BLOCKSIZE => '16384', COMPRESSION => 'SNAPPY'} Updating all regions with the new schema... 1/1 regions updated. Done. 0 row(s) in 1.9660 seconds Here, we are altering the table and column family definition to change the BLOCKSIZE to be 16 K and the COMPRESSION codec to be SNAPPY: hbase(main):004:0> version 1.1.2.2.3.6.2-3, r2873b074585fce900c3f9592ae16fdd2d4d3a446, Thu Aug 4 18:41:44 UTC 2016 hbase(main):005:0> describe 'sensor_telemetry' Table sensor_telemetry is ENABLED sensor_telemetry COLUMN FAMILIES DESCRIPTION {NAME => 'metrics', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '16384', REPLICATION_SCOPE => '0'} 1 row(s) in 0.0410 seconds This is what the table definition now looks like after our ALTER table statement. Next, let's scan the table to see what it contains: hbase(main):007:0> scan 'sensor_telemetry' ROW COLUMN+CELL 0 row(s) in 0.0750 seconds No surprises, the table is empty. So, let's populate some data into the table: hbase(main):007:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'temperature', '65' ERROR: Unknown column family! Valid column names: metrics:* Here, we are attempting to insert data into the sensor_telemetry table. We are attempting to store the value '65' for the column qualifier 'temperature' for a row key '/94555/20170308/18:30'. This is unsuccessful because the column 'temperature' is not associated with any column family. In HBase, you always need the row key, the column family and the column qualifier to uniquely specify a value. So, let's try this again: hbase(main):008:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'metrics:temperature', '65' 0 row(s) in 0.0120 seconds Ok, that seemed to be successful. Let's confirm that we now have some data in the table: hbase(main):009:0> count 'sensor_telemetry' 1 row(s) in 0.0620 seconds => 1 Ok, it looks like we are on the right track. Let's scan the table to see what it contains: hbase(main):010:0> scan 'sensor_telemetry' ROW COLUMN+CELL /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810397402,value=65 1 row(s) in 0.0190 seconds This tells us we've got data for a single row and a single column. The insert time epoch in milliseconds was 1501810397402. In addition to a scan operation, which scans through all of the rows in the table, HBase also provides a get operation, where you can retrieve data for one or more rows, if you know the keys: hbase(main):011:0> get 'sensor_telemetry', '/94555/20170308/18:30' COLUMN CELL metrics:temperature timestamp=1501810397402, value=65 OK, that returns the row as expected. Next, let's look at the effect of cell versions. As we've discussed before, a value in HBase is defined by a combination of Row-key, Col-family, Col-qualifier, Timestamp. 
To understand this, let's insert the value '66', for the same row key and column qualifier as before: hbase(main):012:0> put 'sensor_telemetry', '/94555/20170308/18:30', 'metrics:temperature', '66' 0 row(s) in 0.0080 seconds Now let's read the value for the row key back: hbase(main):013:0> get 'sensor_telemetry', '/94555/20170308/18:30' COLUMN CELL metrics:temperature timestamp=1501810496459, value=66 1 row(s) in 0.0130 seconds This is in line with what we expect, and this is the standard behavior we'd expect from any database. A put in HBase is the equivalent to an upsert in an RDBMS. Like an upsert, put inserts a value if it doesn't already exist and updates it if a prior value exists. Now, this is where things get interesting. The get operation in HBase allows us to retrieve data associated with a particular timestamp: hbase(main):015:0> get 'sensor_telemetry', '/94555/20170308/18:30', {COLUMN => 'metrics:temperature', TIMESTAMP => 1501810397402} COLUMN CELL metrics:temperature timestamp=1501810397402,value=65 1 row(s) in 0.0120 seconds   We are able to retrieve the old value of 65 by providing the right timestamp. So, puts in HBase don't overwrite the old value, they merely hide it; we can always retrieve the old values by providing the timestamps. Now, let's insert more data into the table: hbase(main):028:0> put 'sensor_telemetry', '/94555/20170307/18:30', 'metrics:temperature', '43' 0 row(s) in 0.0080 seconds hbase(main):029:0> put 'sensor_telemetry', '/94555/20170306/18:30', 'metrics:temperature', '33' 0 row(s) in 0.0070 seconds Now, let's scan the table back: hbase(main):030:0> scan 'sensor_telemetry' ROW COLUMN+CELL /94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956, value=33 /94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43 /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941,value=67 3 row(s) in 0.0310 seconds We can also scan the table in reverse key order: hbase(main):031:0> scan 'sensor_telemetry', {REVERSED => true} ROW COLUMN+CELL /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67 /94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43 /94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956,value=33 3 row(s) in 0.0520 seconds What if we wanted all the rows, but in addition, wanted all the cell versions from each row? We can easily retrieve that: hbase(main):032:0> scan 'sensor_telemetry', {RAW => true, VERSIONS => 10} ROW COLUMN+CELL /94555/20170306/18:30 column=metrics:temperature, timestamp=1501810843956, value=33 /94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43 /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67 /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810496459, value=66 /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810397402, value=65 Here, we are retrieving all three values of the row key /94555/20170308/18:30 in the scan result set. 
HBase scan operations don't need to go from the beginning to the end of the table; you can optionally specify the row to start scanning from and the row to stop the scan operation at: hbase(main):034:0> scan 'sensor_telemetry', {STARTROW => '/94555/20170307'} ROW COLUMN+CELL /94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43 /94555/20170308/18:30 column=metrics:temperature, timestamp=1501810615941, value=67 2 row(s) in 0.0550 seconds HBase also provides the ability to supply filters to the scan operation to restrict what rows are returned by the scan operation. It's possible to implement your own filters, but there's rarely a need to. There's a large collection of filters that are already implemented: hbase(main):033:0> scan 'sensor_telemetry', {ROWPREFIXFILTER => '/94555/20170307'} ROW COLUMN+CELL /94555/20170307/18:30 column=metrics:temperature, timestamp=1501810835262, value=43 1 row(s) in 0.0300 seconds This returns all the rows whose keys have the prefix /94555/20170307: hbase(main):033:0> scan 'sensor_telemetry', { FILTER => SingleColumnValueFilter.new( Bytes.toBytes('metrics'), Bytes.toBytes('temperature'), CompareFilter::CompareOp.valueOf('EQUAL'), BinaryComparator.new(Bytes.toBytes('66')))} The SingleColumnValueFilter can be used to scan a table and look for all rows with a given column value. We saw how fairly easy it is to work with your data in HBase using the HBase shell. If you found this excerpt useful, make sure you check out the book 'Seven NoSQL Databases in a Week', to get more hands-on information about HBase and the other popular NoSQL databases out there today. Read More Level Up Your Company’s Big Data with Mesos 2018 is the year of graph databases. Here’s why. Top 5 NoSQL Databases
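The shell covers day-to-day exploration, but applications usually reach HBase through one of its client interfaces. As a purely illustrative sketch (it is not part of the excerpt above), the following Node.js snippet reads back the row we created via HBase's REST gateway; it assumes the REST server has been started (for example with hbase rest start, which listens on port 8080 by default) and that the host, port, table, and row key below match your setup:

// rest_get.js - a hedged sketch: fetch one row from HBase over the REST gateway.
// Assumes the REST server is running on localhost:8080 and that the table
// 'sensor_telemetry' contains the row '/94555/20170308/18:30'.
var http = require('http');

var rowKey = encodeURIComponent('/94555/20170308/18:30'); // row keys must be URL-encoded

var options = {
  host: 'localhost',
  port: 8080,
  path: '/sensor_telemetry/' + rowKey,
  headers: { 'Accept': 'application/json' }
};

http.get(options, function(res) {
  var body = '';
  res.on('data', function(chunk) { body += chunk; });
  res.on('end', function() {
    var result = JSON.parse(body);
    // Column names and values come back base64-encoded in the REST JSON format
    result.Row.forEach(function(r) {
      r.Cell.forEach(function(cell) {
        console.log(
          Buffer.from(cell.column, 'base64').toString(), // e.g. metrics:temperature
          Buffer.from(cell['$'], 'base64').toString(),   // e.g. 67
          cell.timestamp
        );
      });
    });
  });
}).on('error', function(err) {
  console.error('REST request failed:', err);
});

This is only one of several client options (the Java API and Thrift gateway are others); the point is that the same row, column-family, and qualifier model you explored in the shell carries over unchanged.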
Unity assets to create interactive 2D games [Tutorial]

Amarabha Banerjee
25 Jun 2018
20 min read
Unity assets are part of the unity ecosystem which help you to create in-game environments, gameplay options effectively. In this article, we are going to show you how to work with Unity assets which will eventually help you create fun and interactive 2D games with Unity 2017. This article is a part of the book titled "Unity 2017 2D Game Development Projects" written by Lauren S. Ferro & Francesco Sapio. This book helps you to create exciting 2D games from scratch easily. Textures and Sprites Before you start anything within Unity, it is useful to know that Textures and Sprites within Unity are two separate things, although they are used in similar contexts. To begin, a Sprite is an image that can be used as a 2D object. It has only two coordinates: x-axis and y-axis. Therefore, all the graphical components of 2D game development are called Sprites. Sprites can be repositioned, scaled, and rotated like any other game object in Unity. You can move, destroy, or create it during the game. Sprites, by default, are rendered directly against the camera; however, you can easily change this if you are using the Sprite Renderer in a 3D scene. They work with the Sprite Renderer, unlike a 3D object, which works with the Mesh Renderer. Besides Sprites, there are other graphical components called Textures. These are also images, but they are used to change the appearance of an object in both 2D (for example, Sprites and background) and 3D (for example, an object or character's appearance). But Textures are not objects. This means that you cannot get them to move during gameplay. Saying that, you can create images with Textures that animate, with Sprite Sheets/Atlases. What this means is that each frame of an animation is placed on a Sprite Sheet, which is a Texture, that will eventually be cut up so that each frame of the animation is played sequentially. Throughout we will use the terms Sprite Sheets and Atlases. While they are pretty much the same thing, the subtle difference between the two is that a Sprite Sheet generally has Sprite (frame-by-frame) animations, whereas an Atlas will contain images such as tileable Textures for the walls and other environmental components (for example, objects). Their purpose is to maximize the space by combining multiple images into one Texture, whether for characters (and their animations) or environmental Textures. More generally speaking, when it comes to handling Sprites and Textures, Unity has various tools that deal with them in different ways and are used for different purposes. A brief overview of each of them follows. We will discuss them in more detail: Sprite Editor: This is used to edit Sprites. This is done by selecting them individually from a larger image, known as a Sprite Atlas, or by changing their Pivot point, and so on. Sprite Creator: This is used to create a Sprite placeholder. This is useful if you do not have any Sprites to use but want to continue implementing the functionality of a game. Sprite placeholders can be replaced later with actual Sprites. Sprite Packer: This is used to increase the efficiency of your project's usage of main memory. It achieves this by packing various Sprites into a single place using the Packing Tag attribute. This appears in the Inspector window when you select a Sprite in the Project window. Sprite Render The Sprite Render displays images that have been imported as the type Sprite. There are a number of different parameters within the Sprite Render that allows you to modify a Sprite. 
We will discuss them here: Color: Color allows you to change the color value and the value of the Alpha channel (transparency) of a Sprite Flip: Flip is what defines the axis that the Sprite needs to be flipped on Material: Material refers to the material that Unity will use to render the Sprite Sorting Layer: Sorting Layer defines which layer the Sprite should be rendered on (it basically indicates the order in which the different Sprites are drawn, for example, which one is on top of the others) Order in Layer: Order in Layer is the order within the Sorting Layer Sprite Editor In some cases, you may have a Texture that contains just one graphic element; in other cases, you may have multiple ones. The latter is more effective for many reasons, such as saving computational resources and keeping things organized. A case in which you are likely to combine many Sprites into one Texture may be frame-by-frame animations of a character, where other Sprites may be parts of a character (such as clothing and items), and will need to be customizable, such as different items (and their effects). In Unity, you can easily extract elements from a single Texture by using the Sprite Editor. The Sprite Editor is used to take multiple elements from an Atlas or Sprite Sheet and slice them into individual Sprites. How to use the Sprite Editor To open the Sprite Editor, perform the following steps: Drag and drop some images (anything you have on your computer, so you can have them as test images) into the Project panel. Select the 2D image you want to edit from the Project view. In the Inspector, change the Texture Type into Sprite (2D and UI), so you will be able to use it within the Sprite Editor. Click on the Sprite Editor button in the Texture Import Inspector and the Sprite Editor displays. When you open the Sprite Editor window, you can move it around like any other window within Unity; you can also dock next to others such as the Hierarchy or Project windows. To select the Sprites, simply click and drag on the Sprite that you wish to select. As a result, you will have bounding boxes around each Sprite element that you have selected, as in the following screenshot: If you happen to click and drag too much around a Sprite, don't panic! You can easily resize the bounding box by clicking on any of the four corners or edges of the bounding box, like in the upcoming screenshot. Alternatively, you can also reposition the bounding box by clicking and dragging in the middle of the box itself. While you're creating these selections, it is important to make sure that you name them appropriately. To do this, click on the box surrounding the Sprite that you wish to name. You will notice that a box appears. Now, next to where it says Name is where you enter the name that you wish to call your Sprite. Another thing that is also to keep in mind here is the Pivot of the Sprite. Think of this as the Sprite's center. For example, if you rotate a Sprite, it will rotate wherever its Pivot is .0. A few more elements that you will also find useful while you are slicing up your Sprites are the options located at the top of the Sprite Editor window. We will discuss them now. You can only see the Sprite Editor button if the TextureType on the image you have selected is set to Sprite (2D and UI). In addition, you cannot edit a Sprite which is in the Scene view. Slice Menu: One great feature of Unity is the opportunity to automatically slice Sprites. 
What this means is that if you have a large Sprite sheet with various animations, images, and so on, you can automatically cut each image out. You have two options to do this: Automatic: Automatic is better for when you have unevenly distributed Sprites, such as the case with an Atlas.  When choosing the location of the Pivot, it will, by default set it to the center. Method: Method tells you how to deal with existing Sprites within the Sprite Editor window. For example, if you select Delete Existing, it replaces any Sprites that exist (with the same name) with new Sprites; Smart will try to create new Sprites while at the same time adjusting existing ones, and Safe will add new Sprites without changing any that currently exist. The Grid is better for when you have Sprites that are evenly distributed, such as frame-by-frame animations. In these cases, it is not recommended to use Automatic because the size differences between each Sprite may cause unintended effects in terms of how they appear within the game, such as the Pivot being in the wrong location, resulting in an inaccurate animation. An example of the Grid menu is shown in the following screenshot. Pixel Size sets the size of the Grid in the unit of Pixels. This number will be determined based on the size of your Sprite Sheet and distribution of Sprites: Sprite Packer Using the Sprite Packer, you can combine multiple elements such as large sets of Sprites into a single Texture known as an Atlas. However, before using it, we must first make sure that it is enabled within Unity. To do this, go to Edit | Project Settings | Editor. Once you have done this, look at the Inspector; you can change the Sprite Packer from disabled to Always Enabled or vice versa. You can see an example of this in the following screenshot. By selecting Always Enabled. The Sprite Packer will always be enabled whenever you start a new project. That way, you will not need to worry about enabling it again: One of the benefits of using this is that it can boost the performance of your game by reducing the number of Draw Calls each frame. This is because a significant portion of a Sprite Texture will often be taken up by the empty space between the graphic elements. As a result, it wastes video memory at runtime because it needs to compute this empty space even if there is nothing there. By keeping this in mind, when you are creating your own Sprites, try to pack graphics from several Sprite Textures together and as close as possible to one another within an Atlas. Lastly, keep in mind that depending on the sizes of your Sprites, an Atlas should not be larger than 2048 x 2048 or 211 (or at least, this guarantees compatibility with many devices). Unity handles the generation and use of Sprite Atlas Textures behind the scenes so that the user does not need to do any manual assignment. The Atlas can optionally be packed on entering Play mode or during a build, and the graphics for a sprite object will be obtained from the Atlas once it is generated. Users are required to specify a Packing Tag in the Texture Importer to enable packing for Sprites of that Texture. To use the Sprite packer, simply go to the top navigation menu and select Window | Sprite Packer. Once you have done this, it will open the Sprite Packer. Sprite Creator is your friend when you have no assets While we have Sprites, in this case, you might not always have them. If you don't have Sprites, you can always add placeholders or images in the place of where they are likely to be. 
This is a useful thing to use when you're prototyping an idea and you need to get functionality working before your images are ready to go. Using the Sprite Creator is quite simple. We can create a placeholder Sprite by doing the following: First, select Assets | Create | Sprites. Next, select the placeholder Sprite you want to make, like in the following screenshot. Unity offers only six different placeholder Sprites: Square, Circle, Triangle, Diamond, Hexagon, and Polygon. Before creating the Sprite, it is important to make sure that you select the folder that you want the Sprite to be created in. This just saves time later from having to move it to the correct folder. This is because, when creating a Sprite with the Sprite Creator, it will automatically place it in the Asset folder that you currently have open in the Project Window. Lastly, from the list, select the placeholder Sprite that you wish to use: Once you have chosen your Sprite, it will appear as a white shape. The Texture created by Unity will use the .png file format and contain default image data of 4x4 white pixels. At this stage, the new Sprite will have a default name based on its shapes, such as Square or Circle. You have the option to rename it, but if you don't change it, don't worry, as each additional Sprite that is the same shape will simply have a number following its name. You can, of course, always change the name of the Sprite later by clicking on it in the Asset folder where it is located: Once your new Sprite has been created, simply drag and drop your placeholder Sprite into the Scene view or Hierarchy to start using it in your project. An example of this can be seen in the following screenshot: Once you're done, whether it is a mock-up, prototype, or something else, you may want to change the placeholder Sprite to the actual image. Once you have imported the new image(s), simply do the following: Click on the Sprite within the Scene view so that it is selected. Now, in the Inspector, locate Sprite Renderer Component. An example of this is shown in the following screenshot: Now, where it says Sprite, click on the small circle located next to the Sprite name, in this case, Hexagon. This is highlighted in the following screenshot: Now, a small window will be displayed, like in the following screenshot: The Sprite Creator makes 4x4 white PNG outline Textures, which is a power of 2-sized Texture that is actually generated by an algorithm. Setting up the Angel Cakes project Now we're going to discuss how to set up our first project! For the rest, we're going to discuss how to import the assets for the Angel Cakes project into Unity and get the project ready to go. We'll cover the process for importing and setting up while getting you familiar with 2D assets. To begin, let's get the Angel Cakes asset pack, which is featured in the following screenshot: To download the assets, simply visit www.player26.com/product/Angelcakes and download the .zip file. Once you have finished downloading it, simply unzip the file with a program such as WinRAR. Folder setup You need to make sure that you have some foundational folders created to use with your project. To briefly recap, have a look at the following screenshot. 
Remember that the Assets folder is always the root or parent folder for the project files: Importing assets into the engine With your folders set up, we now begin to import some images for our project: the background, the player, an enemy, player collision (wall, objects), and collectables (Angel Cakes, health, and bonuses). Importing the assets into Unity is easy. First, click on the folder that you want the Sprites to be imported into, inside the Project window; for this project, we will use the folder titled Sprites Next, in the top menu, click Assets | Import New Assets and navigate to the folder that they are located in Once you have found them, select them and then click Import Once they are imported, they will appear in the folder, like in the following screenshot: Configuring assets for the game The assets used in this game do not need much configuring, in comparison to the ones that we will use later. Once you have imported the two Sprites into Unity, do the following: Select each one within the Project window. Now, in the Inspector, change the Sprite Mode to Multiple. This is because we have multiple images of each Texture. One is an Atlas (the environmental objects) and one is a Sprite Sheet (character animations). Once you have done this, click Apply: Once you have changed the Sprite Mode to Multiple, click Sprite Editor. Now you should see something like the following screenshot: First, click on Slice and select Grid By Cell Size Next, in Pixel Size, change the values of X and Y to 50, like in the following screenshot, then click Slice: Now, if you hold down Ctrl (or command on a Mac) you will see all the freshly cut slices, like in the following screenshot: If you click on each slice, you will notice that a Sprite information box will appear, like in the following screenshot: In this information box, you can rename the Sprite to whatever you would like. Each Sprite has been given a number so that you can understand the corresponding name conventions that are described following screenshot: For this project, we will call each Sprite set the following: Numbers 1-6: ACSpriteChar1...2...3...4... Numbers 7 - 12: ACSpriteCharEvo1...2...3...4... Numbers 13 - 18: ACSpriteCharEnemie1...2...3...4... Number 19: Delete Once you have done this, you can now see all your Sprites within the Project window. To do this, simply click on the triangle that is highlighted in the following screenshot: Once you have clicked this, it will expand, revealing all of your Sprites and their names, like in the following screenshot: There are many things that we will now be able to do with these images, such as animations. The next thing that we need to do now is slice up the environment Atlas. Locate the Sprite file within the Project window and open it up in the Sprite Editor. Remember that you need to change the Sprite type to Multiple in the Inspector, otherwise you will not be able to edit the Sprite. Once you have it in the Sprite Editor, it should look something like the following: This time, instead of selecting the Slice type Grid By Cell Size, we will do it manually. This is because if we choose to do it via the type Automatic, we will find that there are other slices, like those on the clouds on the right of the following screenshot. This can be tedious when there are lots of little parts of a single Sprite, such as the Clouds: So, for now, manually drag and click around each of the Sprites, making sure that you get as close to the edges as possible. 
You may find that you will need to zoom in on some parts (by using the mouse scroll wheel), like the Angel Cakes. Also, the options in the top-right corner might help you by filtering the image (for example, black and white). As you begin refining the bounding box, you will feel the outline pull or snap toward the edges of the Sprite; this helps you to get as close as possible to the edges, therefore creating more efficient Sprites. Don't forget to name the Sprites either! For this project, we will call each Sprite set the following: ACSpriteEnviroBlock ACSpriteMenuBlock ACSpriteBonus ACSpriteHealth ACSpriteCake ACSpriteBackground ACSpriteCloud1...2...3...and so on To give you a better idea where each Sprite is located, have a look at the following screenshot. The Sprites are numbered so that you can easily locate them. Once you have done this, click on Apply in the top-right corner of the Sprite Editor. As a result, you should be able to see all the Sprites in the Project window by clicking on the triangle. It should look like the following screenshot: 9-slicing Sprites A nice little feature of Unity that allows you to scale elements such as Sprites without distortion is 9-slicing. Essentially, what 9-slicing does is allow you to reuse an image at various sizes without needing to prepare multiple Assets. As the name suggests, it involves splitting the image into nine segments. An example of this splitting is shown in the following screenshot: The following four points describe what will happen if you change the dimensions of the preceding image: If you change the four corners (A, C, G, and I), they will not change in size If you move sections B and H, they will stretch or tile horizontally If you move sections D and F, they will stretch or tile vertically If you move section E, the image will stretch or tile both horizontally and vertically You can see these four points illustrated in the following screenshot: By using 9-slicing, you can re-size the Sprite in different ways and keep the proportions. This is particularly useful for creating the walls within our environment that will create obstacles for our little Angel and enemies to navigate around. We will need to do this for our ACSpriteEnviroBlock so that we can place it within our level for the player to navigate around. To do this, we need to make sure that the Sprite that we have created has been set up properly. First, you need to make sure the Mesh Type is set to Full Rect. To do this, select the Angel_Cake_Sprite_Atlas (contained in Project window | Asset | Sprites), then head to the Inspector and change Mesh Type from Tight to Full Rect, like in the following screenshot: Now we need to define the borders of the Sprite. To do this, perform the following steps: First, select the Sprite (Angel_Cake_Sprite_Atlas). Next, in the Inspector, click the Sprite Editor button. Now, click on the Sprite that you want to apply the 9-slicing to. In our case, this will be the ACSpriteEnviroBlock, like in the following screenshot: Looking at the Sprite information box in the bottom-right corner, we need to adjust the values for the Borders of the Sprite. For this Sprite, we will use the value of 20 for L, R, T, and B (left, right, top, and bottom, respectively): In some cases, you might need to tweak the position of the borders; you can do this by clicking and dragging the green dots located at the intersections of each border (top, bottom, and sides). 
You can see this in the following screenshot: To test your 9-sliced Sprite, drag it from the Project window into the Scene, like in the following screenshot: Next, in the Inspector, change the Draw Mode from Simple to Sliced, like in the following screenshot: Now you can resize the ACSpriteEnviroBlock without it deforming. Give it a go! You should have something like the variations in the following screenshot:   You will notice that it isn't quite like the Sprite. This is okay, we can adjust this setting in the Inspector. Simply click on the Atlas Texture in the Project window and, in the Inspector, change the value of Pixels Per Unit to 250: Click Apply, then click and drag another ACSpriteEnviroBlock onto the Scene and try to resize it. You will end up with something like the following screenshot: As you can see, there is a little distortion. This just means that you will need to edit the Borders inside the Sprite Editor until you get the location of them correct. For now, tinker with the locations of the borders. To summarize, we have shown how to work with the Unity 2017 assets, and how you work and configure sprites for your 2D game projects effectively. If you have liked this article, then don't forget to check out the complete book Unity 2017 2D Game Development Projects   by Lauren S. Ferro & Francesco Sapio on the Packt store. Working with Unity Variables to script powerful Unity 2017 games Build an ARCore app with Unity from scratch Unity announces a new automotive division and two-day Unity AutoTech Summit
Build an IoT application with AWS IoT [Tutorial]

Gebin George
22 Jun 2018
23 min read
Developing IoT applications has never been easier thanks to the cloud. All the major cloud vendors provide IoT tools; in this tutorial you'll learn how to build a complete IoT application with AWS IoT. This article is an excerpt from the book, Enterprise Internet of Things Handbook, written by Arvind Ravulavaru.  End-to-end communication To get started with AWS IoT, you need an AWS account. If you don't have an AWS account, you can create one here. Once you have created your account, you can log in and navigate to the AWS IoT Console. Setting up the IoT Thing Once you are on the AWS IoT Console page, make sure you have selected a region that is close to your location. I have selected the US East (N. Virginia) region as shown in the following screenshot: Now, click on the Get started button in the center of the page. From the side menu, navigate to Manage | Things and you should see a screen as shown here: Next, click on the Register a thing button and you should see a screen as shown here: Right now, we are going to onboard only one Thing. So, click on Create a single thing. On the next screen, we will start filling in the form by naming the device. I have called my device Pi3-DHT11-Node. You can give your Thing any name but do remember to update the code where applicable. Next, we are going to apply a Type. Since this is our first device, we are going to create a new Type. Click on Create a thing type and fill in the form as shown in the following screenshot: If we have different types of devices, such as motion sensors, door sensors, or DHT11 sensors, we can create a Type to easily group our nodes. Click on the Create thing type button and this will create a new type; select that value as the default. Next, we are going to add this device to a group—a group of Raspberry Pi 3, DHT11 nodes. You can group your devices as per your requirements and classification. Now, click on Create group and create it with the following values: We have added two attributes to identify this group easily, as shown in the previous screenshot. Click on the Create thing group and this will create a new group—select that value as the default. These are the only Things we are going to set up in this step. Your form should look something like this: At the bottom of the page, click on the Next button. Now, we need to create a certificate for the Thing. AWS uses certificate-based authentication and authorization to create a secure connection between the device and AWS IoT Core. For more information, refer to MQTT Security Fundamentals: X509 Client Certificate Authentication: https://www.hivemq.com/blog/mqtt-security-fundamentals-x509-client-certificate-authentication. The current screen should look as shown here: Under One-click certificate creation (recommended), click on the Create certificate button. This will create three certificates as illustrated in the following screenshot: Do not share these certificates with anyone. These are as good as the username and password of your device to post data to AWS IoT. Once the certificates are created, download the following: Client certificate: db80b0f635.cert.pem Public Key: db80b0f635.public.key Private Key: db80b0f635.private.key Root CA: From this URL you can download or copy the text: https://www.symantec.com/content/en/us/enterprise/verisign/roots/VeriSign-Class%203-Public-Primary-Certification-Authority-G5.pem My keys start with db80b0f635. Yours may start with something else. Once you have downloaded the keys, click on the Activate button. 
Once the activation is successful, click on Attach a policy. Since we did not create any policies, you will see a screen similar to as what is shown here: No issues with that. We will create a policy manually and associate it with this certificate in a moment. Finally, click on the Register Thing button and a new Thing named Pi3-DHT11-Node will be created. Click on Pi3-DHT11-Node and you should see something like this: We are not done with the setup yet. We still need to create a policy and attach it with a certificate to proceed. Navigate back to the Things page, and from the side menu on this page, select Secure | Policies: Now, click on Create a policy and fill in the form as demonstrated in the following screenshot: In the previously demonstrated policy, we are allowing any kind of IoT operation to be performed by the device that uses this policy and on any resource. This is a dangerous setup, mainly for production; however, this is okay for learning purposes. Click on the Create button and this will create a new policy. Now, we are going to attach this policy to a certificate. Navigate to Secure | Certificates and, using the options available at the top-right of the certificate we created, we are going to attach the policy: Click on Attach policy on the previous screen and select the policy we have just created: Now, click on Attach to complete the setup. With this, we are done with the setup of a Thing. In the next section, we are going to use Node.js as a client on Raspberry Pi 3 to send data to the AWS IoT. Setting up Raspberry Pi 3 on the DHT11 node Now that we have our Thing set up in AWS IoT, we are going to complete the remaining operation in Raspberry Pi to send data. Things needed You will need the following hardware to set up Raspberry Pi 3 on the DHT11 node: One Raspberry Pi 3: https://www.amazon.com/Raspberry-Pi-Desktop-Starter-White/dp/B01CI58722 One breadboard: https://www.amazon.com/Solderless-Breadboard-Circuit-Circboard-Prototyping/dp/B01DDI54II/ One DHT11 sensor: https://www.amazon.com/HiLetgo-Temperature-Humidity-Arduino-Raspberry/dp/B01DKC2GQ0 Three male-to-female jumper cables: https://www.amazon.com/RGBZONE-120pcs-Multicolored-Dupont-Breadboard/dp/B01M1IEUAF/ If you are new to the world of Raspberry Pi GPIO's interfacing, take a look at this Raspberry Pi GPIO Tutorial: The Basics Explained video tutorial on YouTube, at: https://www.youtube.com/watch?v=6PuK9fh3aL8. Connect the DHT11 sensor to Raspberry Pi 3 as shown in the following diagram: Next, start Raspberry Pi 3 and log in to it. On the desktop, create a new folder named AWS-IoT-Thing. Open a new Terminal and cd into this folder. Setting up Node.js If Node.js is not installed, please refer to the following steps: Open a new Terminal and run the following commands: $ sudo apt update $ sudo apt full-upgrade This will upgrade all the packages that need upgrades. Next, we will install the latest version of Node.js. We will be using the Node 7.x version: $ curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash - $ sudo apt install nodejs This will take a moment to install, and once your installation is done, you should be able to see the version of Node.js and NPM after running the following commands: $ node -v $ npm -v Developing the Node.js Thing app Now, we will set up the app and write the required code: From the Terminal, once you are inside the AWS-IoT-Thing folder, run the following command: $ npm init -y Next, we will install aws-iot-device-sdk (http://npmjs.com/package/aws-iot-device-sdk) from NPM. 
This module has the required client code to interface with AWS IoT. Execute the following command:

```
$ npm install aws-iot-device-sdk --save
```

Next, we will install rpi-dht-sensor (https://www.npmjs.com/package/rpi-dht-sensor) from NPM. This module will help in reading the DHT11 temperature and humidity values. Let's run the following command:

```
$ npm install rpi-dht-sensor --save
```

Your final package.json file should look like this:

```json
{
  "name": "AWS-IoT-Thing",
  "version": "1.0.0",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "description": "",
  "dependencies": {
    "aws-iot-device-sdk": "^2.2.0",
    "rpi-dht-sensor": "^0.1.1"
  }
}
```

Now that we have the required dependencies installed, let's continue:

1. Create a new file named index.js at the root of the AWS-IoT-Thing folder.
2. Next, create a folder named certs at the root of the AWS-IoT-Thing folder and move the four certificates we have downloaded there. Your final folder structure should look something like this:
3. Open index.js in any text editor and update it as shown in the following code snippet:

```javascript
var awsIot = require('aws-iot-device-sdk');
var rpiDhtSensor = require('rpi-dht-sensor');

var dht = new rpiDhtSensor.DHT11(2); // `2` => GPIO2

const NODE_ID = 'Pi3-DHT11-Node';
const INIT_DELAY = 15;
const TAG = '[' + NODE_ID + '] >>>>>>>>> ';

console.log(TAG, 'Connecting...');

var thingShadow = awsIot.thingShadow({
    keyPath: './certs/db80b0f635-private.pem.key',
    certPath: './certs/db80b0f635-certificate.pem.crt',
    caPath: './certs/RootCA-VeriSign-Class 3-Public-Primary-Certification-Authority-G5.pem',
    clientId: NODE_ID,
    host: 'a1afizfoknpwqg.iot.us-east-1.amazonaws.com',
    port: 8883,
    region: 'us-east-1',
    debug: false, // optional to see logs on console
});

thingShadow.on('connect', function() {
    console.log(TAG, 'Connected.');
    thingShadow.register(NODE_ID, {}, function() {
        console.log(TAG, 'Registered.');
        console.log(TAG, 'Reading data in ' + INIT_DELAY + ' seconds.');
        setTimeout(sendData, INIT_DELAY * 1000); // wait for `INIT_DELAY` seconds before reading the first record
    });
});

function fetchData() {
    var readout = dht.read();
    var temp = readout.temperature.toFixed(2);
    var humd = readout.humidity.toFixed(2);
    return {
        "temp": temp,
        "humd": humd
    };
}

function sendData() {
    var DHT11State = {
        "state": {
            "desired": fetchData()
        }
    };
    console.log(TAG, 'Sending Data..', DHT11State);
    var clientTokenUpdate = thingShadow.update(NODE_ID, DHT11State);
    if (clientTokenUpdate === null) {
        console.log(TAG, 'Shadow update failed, operation still in progress');
    } else {
        console.log(TAG, 'Shadow update success.');
    }
    // keep sending the data every 30 seconds
    console.log(TAG, 'Reading data again in 30 seconds.');
    setTimeout(sendData, 30000); // 30,000 ms => 30 seconds
}

thingShadow.on('status', function(thingName, stat, clientToken, stateObject) {
    console.log('received ' + stat + ' on ' + thingName + ':', stateObject);
});

thingShadow.on('delta', function(thingName, stateObject) {
    console.log('received delta on ' + thingName + ':', stateObject);
});

thingShadow.on('timeout', function(thingName, clientToken) {
    console.log('received timeout on ' + thingName + ' with token:', clientToken);
});
```

In the previous code, we are using awsIot.thingShadow() to connect to the AWS Thing we have created. To awsIot.thingShadow(), we pass the following options:

- keyPath: This is the location of private.pem.key, which we have downloaded and placed in the certs folder.
- certPath: This is the location of certificate.pem.crt, which we have downloaded and placed in the certs folder.
- caPath: This is the location of RootCA-VeriSign-Class 3-Public-Primary-Certification-Authority-G5.pem, which we have downloaded and placed in the certs folder.
- clientId: This is the name of the Thing we have created in AWS IoT, Pi3-DHT11-Node.
- host: This is the URL to which the Thing needs to connect. This URL is different for different Things. To get the host, navigate to your Thing and click the Interact tab as shown in the following screenshot: The highlighted URL is the host.
- port: We are using SSL-based communication, so the port will be 8883.
- region: The region in which you have created the Thing. You can find this in the URL of the page. For example, https://console.aws.amazon.com/iot/home?region=us-east-1#/thing/Pi3-DHT11-Node.
- debug: This is optional. If you want to see some logs from the module during execution, you can set this property to true.

We will connect to the preceding host with our certificates. In the thingShadow.on('connect') callback, we call thingShadow.register() to register. We need to register only once per connection. Once the registration is completed, we will start to gather the data from the DHT11 sensor and, using thingShadow.update(), we will update the shadow. In the thingShadow.on('status') callback, we will get to know the status of the update.

Save the file and execute the following command:

```
$ sudo node index.js
```

We should see something like this: As you can see from the previous logs on the console screen, the device first gets connected and then registers itself. Once the registration is done, we wait for 15 seconds before transmitting the first record, then we wait another 30 seconds between readings and continue the process. We are also listening for the status and delta events to make sure that what we have sent has been successfully updated.

Now, if we head back to the AWS IoT Thing page in the AWS Console and click on the Shadow tab, we should see the last record that we sent, updated here: Underneath that, you can see the metadata of the document, which should look something like this:

```json
{
  "metadata": {
    "desired": {
      "temp": {
        "timestamp": 1517888793
      },
      "humd": {
        "timestamp": 1517888793
      }
    }
  },
  "timestamp": 1517888794,
  "version": 16
}
```

The preceding JSON represents the metadata of the Thing's shadow document, assuming that the Thing keeps sending the same structure of data at all times. Now that the Thing is sending data, let us actually read the data coming from this Thing.

Reading the data from the Thing

There are two approaches to getting the shadow data:

- Using the REST API: https://docs.aws.amazon.com/iot/latest/developerguide/device-shadow-rest-api.html
- Using MQTT: https://docs.aws.amazon.com/iot/latest/developerguide/device-shadow-mqtt.html

The following example uses the MQTT approach to fetch the shadow data. Whenever we want to fetch the data of a Thing, we publish an empty packet to the $aws/things/Pi3-DHT11-Node/shadow/get topic. Depending on whether the request was accepted or rejected, we will get a response on $aws/things/Pi3-DHT11-Node/shadow/get/accepted or $aws/things/Pi3-DHT11-Node/shadow/get/rejected, respectively. For testing the data fetch, you can either use the same Raspberry Pi 3 or another computer. I am going to use my MacBook as a client that is interested in the data sent by the Thing.
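If you just want a quick, one-off look at the shadow document without writing any code, the AWS CLI can fetch it directly. This is not part of the original walkthrough and assumes the CLI is installed and configured with credentials that are allowed to call iot:GetThingShadow:

```
# Writes the current shadow document for the Thing to shadow.json, then prints it
aws iot-data get-thing-shadow --thing-name Pi3-DHT11-Node shadow.json
cat shadow.json
```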
In my local machine, I am going to create the following setup, which is very similar to what we have done in Raspberry Pi 3: Create a folder named test_client. Inside the test_client folder, create a folder named certs and get a copy of the same four certificates we have used in Raspberry Pi 3. Inside the test_client folder, run the following command on the Terminal: $ npm init -y Next, install the aws-iot-device-sdk module using the following command: $ npm install aws-iot-device-sdk --save Create a file inside the test_client folder named index.js and update it as shown here: var awsIot = require('aws-iot-device-sdk'); const NODE_ID = 'Pi3-DHT11-Node'; const TAG = '[TEST THING] >>>>>>>>> '; console.log(TAG, 'Connecting...'); var device = awsIot.device({ keyPath: './certs/db80b0f635-private.pem.key', certPath: './certs/db80b0f635-certificate.pem.crt', caPath: './certs/RootCA-VeriSign-Class 3-Public-Primary-Certification-Authority-G5.pem', clientId: NODE_ID, host: 'a1afizfoknpwqg.iot.us-east-1.amazonaws.com', port: 8883, region: 'us-east-1', debug: false, // optional to see logs on console }); device.on('connect', function() { console.log(TAG, 'device connected!'); device.subscribe('$aws/things/Pi3-DHT11-Node/shadow/get/accepted'); device.subscribe('$aws/things/Pi3-DHT11-Node/shadow/get/rejected'); // Publish an empty packet to topic `$aws/things/Pi3-DHT11-Node/shadow/get` // to get the latest shadow data on either `accepted` or `rejected` topic device.publish('$aws/things/Pi3-DHT11-Node/shadow/get', ''); }); device.on('message', function(topic, payload) { payload = JSON.parse(payload.toString()); console.log(TAG, 'message from ', topic, JSON.stringify(payload, null, 4)); }); Update the device information as applicable. Save the file and run the following command on the Terminal: $ node index.js We should see something similar to what is shown in the following console output: This way, any client that is interested in the data of this Thing can use this approach to get the latest data. You can also use an MQTT library in the browser itself to fetch the data from a Thing. But do keep in mind this is not advisable as the certificates are exposed. Instead, you can have a backend microservice that can achieve the same for you and then expose the data via HTTPS. With this, we conclude the section on posting data to AWS IoT and fetching it. In the next section, we are going to work with rules. Building the dashboard Now that we have seen how a client can read the data of our Thing on demand, we will move on to building a dashboard, where we show data in real time. For this, we are going to use Elasticsearch and Kibana. Elasticsearch Elasticsearch is a search engine based on Apache Lucene. It provides a distributed, multi-tenant capable, full-text search engine with an HTTP web interface and schema-free JSON documents. Read more about Elasticsearch at http://whatis.techtarget.com/definition/ElasticSearch. Kibana Kibana is an open source data visualization plugin for Elasticsearch. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster. Users can create bar, line, and scatter plots, or pie charts and maps on top of large volumes of data. Read more about Kibana at https://www.elastic.co/products/kibana. As we have seen in the architecture diagram, we are going to create a rule in AWS IoT. 
The job of the rule is to listen to an AWS topic and then send the temperature and humidity values from that topic to an Elasticsearch cluster that we are going to create using the AWS Elasticsearch Service (https://aws.amazon.com/elasticsearch-service/). The cluster we are going to provision on AWS will also have a Kibana setup for easy visualizations. We are going to use Kibana and build the visualization and then a dashboard from that visualization. We are going to use Elasticsearch and Kibana for a basic use case. The reason I have chosen to use Elasticsearch instead of building a simple web application that can display charts is because we can do way more than just building dashboards in Kibana using Elasticsearch. This is where the IoT Analytics comes in. We are not going to explore IoT analytics per se, but this setup should give you an idea and get you started off. Setting up Elasticsearch Before we proceed further, we are going to provision a new Elasticsearch cluster. Do note that the cluster we are going to provision is under free tier and has a limitation of resources. Read more about the limitations at https://aws.amazon.com/about-aws/whats-new/2017/01/amazon-elasticsearch-service-free-tier-now-available-on-t2-small-elasticsearch-instances/. Neither Packt Publishing nor me is in any way responsible for any billing that happens as a by-product of running any example in this book. Please read the pricing terms carefully before continuing. To set up Elasticsearch, head over to the Amazon Elasticsearch Service console or use the services menu on the of AWS console page to reach the Amazon Elasticsearch Service console page. You should see a screen similar to what is shown here: Click on the Create a new domain button and fill in the next screen, as shown in the following screenshot: Click on the Next button. Under the Node configuration section, update it as illustrated here: If you are planning to run bigger queries, I would recommend checking Enable dedicated master and setting it up. Under the Storage configuration section, update it as illustrated in the following screenshot: Leave the remaining sections as the defaults and click Next to continue. On the Setup access screen, under Network configuration, select Public access, and for the access policy, select Allow open access to the domain and accept the risks associated with this configuration. We are using this setup to quickly work with services and to not worry about credentials and security. DO NOT use this setup in production. Finally, click Next, review the selections we have made, and click Confirm. Once the process has started, it will take up to 10 minutes for the domain to be provisioned. Once the domain is provisioned, we should see a similar screen to what is illustrated here: Here we have our Endpoint, to which data will be sent for indexing. And we also have the URL for Kibana. When we click on the Kibana URL, after it loads, you will be presented with the following screen:   The previous screen will change once we start indexing data. In the next section, we are going to create an IAM role. Setting up an IAM Role Now that we have Elasticsearch and Kibana up and running, we will get started with setting up an IAM role. We will be using this role for the IoT rule and to put data into Elasticsearch: To get started, head over to https://console.aws.amazon.com/iam. From the side menu, click on the Roles link and you should see a screen like this: Select AWS service from the top row and then select IoT. 
Click Next: Permissions to proceed to the next step: All the policies needed for AWS IoT access resources across AWS are preselected. The one we are interested in is under AWSIoTRulesActions and Elasticsearch Service. All we need here is the ESHttpPut action. Finally, click the Next: Review button and fill in the details as shown in the following screenshot: Once we click Create role, a new role with the name provided will be created. Now that we have Elasticsearch up and running, as well as the IAM role needed, we will create the IoT Rule to index incoming data into Elasticsearch. Creating an IoT Rule To get started, head over to AWS IoT and to the region where we have registered our Thing: From the menu on the left-hand side, click on Act and then click the Create a rule option. On the Create rule screen, we will fill in the details as shown in the following table: Field Value Name ES_Indexer Description Index AWS IoT topic data to Elasticsearch service SQL version 2016-03-23 Attribute cast(state.desired.temp as decimal) as temp, cast(state.desired.humd as decimal) as humd, timestamp() as timestamp Topic filter $aws/things/Pi3-DHT11-Node/shadow/update Condition Once we fill the form in with the information mentioned in the previous table, we should see the rule query statement as demonstrated here: SELECT cast(state.desired.temp as decimal) as temp, cast(state.desired.humd as decimal) as humd, timestamp() as timestamp FROM '$aws/things/Pi3-DHT11-Node/shadow/update' This query selects the temperature and humidity values from the $aws/things/Pi3-DHT11-Node/shadow/update topic and casts them to a decimal or float data type. Along with that, we select the timestamp. Now, under Select one or more actions, click Add action, and then select Send messages to the Amazon Elastic Service. Click on Configure action. On the Configure action screen, fill in the details as illustrated here: Field Value Domain Name pi3-dht11-dashboard Endpoint Will get auto selected ID ${newuuid()} Index sensor-data Type dht11 IAM role name iot-rules-role Once the details mentioned in the table are filled in, click on the Create action button to complete the setup. Finally, click on Create rule and a new rule should be created. Elasticsearch configuration Before we continue, we need to configure Elasticsearch to create a mapping. The timestamp that we generate in AWS IoT is of the type long. So, we are going to create a mapping field named datetime with the type date. From a command line with cURL (https://curl.haxx.se/) present, execute the following command: curl -XPUT 'https://search-pi3-dht11-dashboard-tcvfd4kqznae3or3urx52734wi.us-east-1.es.amazonaws.com/sensor-data?pretty' -H 'Content-Type: application/json' -d' { "mappings" : { "dht11" : { "properties" : { "timestamp" : { "type" : "long", "copy_to": "datetime" }, "datetime" : {"type": "date", "store": true } } } } } ' Replace the URL of Elasticsearch in the previous command as applicable. This will take care of creating a mapping when the data comes in. Running the Thing Now that the entire setup is done, we will start pumping data into the Elasticsearch: Head back to Raspberry Pi 3, which was sending the DHT11 temperature and humidity data, and run our application. We should see the data being published to the shadow topic: Head over to the Elasticsearch page, to the pi3-dht11-dashboard domain, and to the Indices tab, and you should see the screen illustrated here: Next, head over to the Kibana dashboard. 
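Optionally, before configuring anything in Kibana, you can confirm that documents are actually reaching the index with a quick search against the same endpoint used for the mapping call above. This is an extra sanity check, not a step from the original flow; replace the URL with your own domain's endpoint:

```
curl -XGET 'https://search-pi3-dht11-dashboard-tcvfd4kqznae3or3urx52734wi.us-east-1.es.amazonaws.com/sensor-data/_search?pretty&size=1'
```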
Now we will configure the Index pattern as shown in the following screenshot: Do not forget to select the time filter. Click on Create and you should see the fields on the next screen, as shown here: Now, click on the Discover tab on the left-hand side of the screen and you should see the data coming in, as shown in the following screenshot: Building the Kibana dashboard Now that we have the data coming in, we will create a new visualization and then add that to our dashboard: Click on the Visualize link from the side menu and then click on Create a Visualization. Then under Basic Charts select Line. On the Choose search source screen, select sensor-data index and this will take us to the graph page. On the Metrics section in the Data tab, set the following as the first metric: Click on the Add metrics button and set up the second one as follows: Now, under Buckets, select X-Axis and select the following: Click on the Play button above this panel and you should see a line chart as follows: This is our temperature and humidity data over a period of time. As you can see, there are plenty of options to choose from regarding how you want to visualize the data: Now, click on the Save option at the top of the menu on the page and name the visualization Temperature & Humidity Visualization. Now, using the side menu, select Dashboard then Create Dashboard, click on Add, and select Temperature & Humidity Visualization. Now, click on Save from the top-most menu on the page and name the dashboard Pi3 DHT11 dashboard. Now we have our own dashboard, which show the temperature and humidity metrics: This wraps up the section on building a visualization using IoT Rule, Elasticsearch, and Kibana. With this, we have seen the basic features and implementation process needed to work with the AWS IoT platform. If you found this post useful, do check out the book,  Enterprise Internet of Things Handbook, to build a robust IoT strategy for your organization. 5 reasons to choose AWS IoT Core for your next IoT project Should you go with Arduino Uno or Raspberry Pi 3 for your next IoT project? How to run and configure an IoT Gateway
Xamarin: How to add a MVVM pattern to an app [Tutorial]

Sugandha Lahoti
22 Jun 2018
13 min read
In our previous tutorial, we created a basic travel app using Xamarin.Forms. In this post, we will look at adding the Model-View-View-Model (MVVM) pattern to our travel app. The MVVM elements are offered with the Xamarin.Forms toolkit and we can expand on them to truly take advantage of the power of the pattern. As we dig into MVVM, we will apply what we have learned to the TripLog app that we started building in our previous tutorial. This article is an excerpt from the book Mastering Xamaring.Forms by Ed Snider. Understanding the MVVM pattern At its core, MVVM is a presentation pattern designed to control the separation between user interfaces and the rest of an application. The key elements of the MVVM pattern are as follows: Models: Models represent the business entities of an application. When responses come back from an API, they are typically deserialized to models. Views: Views represent the actual pages or screens of an application, along with all of the elements that make them up, including custom controls. Views are very platform-specific and depend heavily on platform APIs to render the application's user interface (UI). ViewModels: ViewModels control and manipulate the Views by serving as their data context. ViewModels are made up of a series of properties represented by Models. These properties are part of what is bound to the Views to provide the data that is displayed to users, or to collect the data that is entered or selected by users. In addition to model-backed properties, ViewModels can also contain commands, which are action-backed properties that bind the actual functionality and execution to events that occur in the Views, such as button taps or list item selections. Data binding: Data binding is the concept of connecting data properties and actions in a ViewModel with the user interface elements in a View. The actual implementation of how data binding happens can vary and, in most cases is provided by a framework, toolkit, or library. In Windows app development, data binding is provided declaratively in XAML. In traditional (non-Xamarin.Forms) Xamarin app development, data binding is either a manual process or dependent on a framework such as MvvmCross (https://github.com/MvvmCross/MvvmCross), a popular framework in the .NET mobile development community. Data binding in Xamarin.Forms follows a very similar approach to Windows app development. Adding MVVM to the app The first step of introducing MVVM into an app is to set up the structure by adding folders that will represent the core tenants of the pattern, such as Models, ViewModels, and Views. Traditionally, the Models and ViewModels live in a core library (usually, a portable class library or .NET standard library), whereas the Views live in a platform-specific library. Thanks to the power of the Xamarin.Forms toolkit and its abstraction of platform-specific UI APIs, the Views in a Xamarin.Forms app can also live in the core library. Just because the Views can live in the core library with the ViewModels and Models, this doesn't mean that separation between the user interface and the app logic isn't important. When implementing a specific structure to support a design pattern, it is helpful to have your application namespaces organized in a similar structure. This is not a requirement but it is something that can be useful. 
By default, Visual Studio for Mac will associate namespaces with directory names, as shown in the following screenshot: Setting up the app structure For the TripLog app, we will let the Views, ViewModels, and Models all live in the same core portable class library. In our solution, this is the project called TripLog. We have already added a Models folder in our previous tutorial, so we just need to add a ViewModels folder and a Views folder to the project to complete the MVVM structure. In order to set up the app structure, perform the following steps: Add a new folder named ViewModels to the root of the TripLog project. Add a new folder named Views to the root of the TripLog project. Move the existing XAML pages files (MainPage.xaml, DetailPage.xaml, and NewEntryPage.xaml and their .cs code-behind files) into the Views folder that we have just created. Update the namespace of each Page from TripLog to TripLog.Views. Update the x:Class attribute of each Page's root ContentPage from TripLog.MainPage, TripLog.DetailPage, and TripLog.NewEntryPage to TripLog.Views.MainPage, TripLog.Views.DetailPage, and TripLog.Views.NewEntryPage, respectively. Update the using statements on any class that references the Pages. Currently, this should only be in the App class in App.xaml.cs, where MainPage is instantiated. Once the MVVM structure has been added, the folder structure in the solution should look similar to the following screenshot: In MVVM, the term View is used to describe a screen. Xamarin.Forms uses the term View to describe controls, such as buttons or labels, and uses the term Page to describe a screen. In order to avoid confusion, I will stick with the Xamarin.Forms terminology and refer to screens as Pages, and will only use the term Views in reference to screens for the folder where the Pages will live, in order to stick with the MVVM pattern. Adding ViewModels In most cases, Views (Pages) and ViewModels have a one-to-one relationship. However, it is possible for a View (Page) to contain multiple ViewModels or for a ViewModel to be used by multiple Views (Pages). For now, we will simply have a single ViewModel for each Page. Before we create our ViewModels, we will start by creating a base ViewModel class, which will be an abstract class containing the basic functionality that each of our ViewModels will inherit. Initially, the base ViewModel abstract class will only contain a couple of members and will implement INotifyPropertyChanged, but we will add to this class as we continue to build upon the TripLog app throughout this book. In order to create a base ViewModel, perform the following steps: Create a new abstract class named BaseViewModel in the ViewModels folder using the following code: public abstract class BaseViewModel { protected BaseViewModel() { } } Update BaseViewModel to implement INotifyPropertyChanged: public abstract class BaseViewModel : INotifyPropertyChanged { protected BaseViewModel() { } public event PropertyChangedEventHandler PropertyChanged; protected virtual void OnPropertyChanged( [CallerMemberName] string propertyName = null) { PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName)); } } The implementation of INotifyPropertyChanged is key to the behavior and role of the ViewModels and data binding. It allows a Page to be notified when the properties of its ViewModel have changed. Now that we have created a base ViewModel, we can start adding the actual ViewModels that will serve as the data context for each of our Pages. 
We will start by creating a ViewModel for MainPage. Adding MainViewModel The main purpose of a ViewModel is to separate the business logic, for example, data access and data manipulation, from the user interface logic. Right now, our MainPage directly defines the list of data that it is displaying. This data will eventually be dynamically loaded from an API but for now, we will move this initial static data definition to its ViewModel so that it can be data bound to the user interface. In order to create the ViewModel for MainPage, perform the following steps: Create a new class file in the ViewModels folder and name it MainViewModel. Update the MainViewModel class to inherit from BaseViewModel: public class MainViewModel : BaseViewModel { // ... } Add an ObservableCollection<T> property to the MainViewModel class and name it LogEntries. This property will be used to bind to the ItemsSource property of the ListView element on MainPage.xaml: public class MainViewModel : BaseViewModel { ObservableCollection<TripLogEntry> _logEntries; public ObservableCollection<TripLogEntry> LogEntries { get { return _logEntries; } set { _logEntries = value; OnPropertyChanged (); } } // ... } Next, remove the List<TripLogEntry> that populates the ListView element on MainPage.xaml and repurpose that logic in the MainViewModel—we will put it in the constructor for now: public MainViewModel() { LogEntries = new ObservableCollection<TripLogEntry>(); LogEntries.Add(new TripLogEntry { Title = "Washington Monument", Notes = "Amazing!", Rating = 3, Date = new DateTime(2017, 2, 5), Latitude = 38.8895, Longitude = -77.0352 }); LogEntries.Add(new TripLogEntry { Title = "Statue of Liberty", Notes = "Inspiring!", Rating = 4, Date = new DateTime(2017, 4, 13), Latitude = 40.6892, Longitude = -74.0444 }); LogEntries.Add(new TripLogEntry { Title = "Golden Gate Bridge", Notes = "Foggy, but beautiful.", Rating = 5, Date = new DateTime(2017, 4, 26), Latitude = 37.8268, Longitude = -122.4798 }); } Set MainViewModel as the BindingContext property for MainPage. Do this by simply setting the BindingContext property of MainPage in its code-behind file to a new instance of MainViewModel. The BindingContext property comes from the Xamarin.Forms.ContentPage base class: public MainPage() { InitializeComponent(); BindingContext = new MainViewModel(); } Finally, update how the ListView element on MainPage.xaml gets its items. Currently, its ItemsSource property is being set directly in the Page's code behind. Remove this and instead update the ListView element's tag in MainPage.xaml to bind to the MainViewModel LogEntries property: <ListView ... ItemsSource="{Binding LogEntries}"> Adding DetailViewModel Next, we will add another ViewModel to serve as the data context for DetailPage, as follows: Create a new class file in the ViewModels folder and name it DetailViewModel. Update the DetailViewModel class to inherit from the BaseViewModel abstract class: public class DetailViewModel : BaseViewModel { // ... } Add a TripLogEntry property to the class and name it Entry. This property will be used to bind details about an entry to the various labels on DetailPage: public class DetailViewModel : BaseViewModel { TripLogEntry _entry; public TripLogEntry Entry { get { return _entry; } set { _entry = value; OnPropertyChanged (); } } // ... } Update the DetailViewModel constructor to take a TripLogEntry parameter named entry. 
Use this constructor property to populate the public Entry property created in the previous step: public class DetailViewModel : BaseViewModel { // ... public DetailViewModel(TripLogEntry entry) { Entry = entry; } } Set DetailViewModel as the BindingContext for DetailPage and pass in the TripLogEntry property that is being passed to DetailPage: public DetailPage (TripLogEntry entry) { InitializeComponent(); BindingContext = new DetailViewModel(entry); // ... } Next, remove the code at the end of the DetailPage constructor that directly sets the Text properties of the Label elements: public DetailPage(TripLogEntry entry) { // ... // Remove these lines of code: //title.Text = entry.Title; //date.Text = entry.Date.ToString("M"); //rating.Text = $"{entry.Rating} star rating"; //notes.Text = entry.Notes; } Next, update the Label element tags in DetailPage.xaml to bind their Text properties to the DetailViewModel Entry property: <Label ... Text="{Binding Entry.Title}" /> <Label ... Text="{Binding Entry.Date, StringFormat='{0:M}'}" /> <Label ... Text="{Binding Entry.Rating, StringFormat='{0} star rating'}" /> <Label ... Text="{Binding Entry.Notes}" /> Finally, update the map to get the values it is plotting from the ViewModel. Since the Xamarin.Forms Map control does not have bindable properties, the values have to be set directly to the ViewModel properties. The easiest way to do this is to add a private field to the page that returns the value of the page's BindingContext and then use that field to set the values on the map: public partial class DetailPage : ContentPage { DetailViewModel _vm { get { return BindingContext as DetailViewModel; } } public DetailPage(TripLogEntry entry) { InitializeComponent(); BindingContext = new DetailViewModel(entry); TripMap.MoveToRegion(MapSpan.FromCenterAndRadius( new Position(_vm.Entry.Latitude, _vm.Entry.Longitude), Distance.FromMiles(.5))); TripMap.Pins.Add(new Pin { Type = PinType.Place, Label = _vm.Entry.Title, Position = new Position(_vm.Entry.Latitude, _vm.Entry.Longitude) }); } } Adding NewEntryViewModel Finally, we will need to add a ViewModel for NewEntryPage, as follows: Create a new class file in the ViewModels folder and name it NewEntryViewModel. Update the NewEntryViewModel class to inherit from BaseViewModel: public class NewEntryViewModel : BaseViewModel { // ... } Add public properties to the NewEntryViewModel class that will be used to bind it to the values entered into the EntryCell elements in NewEntryPage.xaml: public class NewEntryViewModel : BaseViewModel { string _title; public string Title { get { return _title; } set { _title = value; OnPropertyChanged(); } } double _latitude; public double Latitude { get { return _latitude; } set { _latitude = value; OnPropertyChanged(); } } double _longitude; public double Longitude { get { return _longitude; } set { _longitude = value; OnPropertyChanged(); } } DateTime _date; public DateTime Date { get { return _date; } set { _date = value; OnPropertyChanged(); } } int _rating; public int Rating { get { return _rating; } set { _rating = value; OnPropertyChanged(); } } string _notes; public string Notes { get { return _notes; } set { _notes = value; OnPropertyChanged(); } } // ... } Update the NewEntryViewModel constructor to initialize the Date and Rating properties: public NewEntryViewModel() { Date = DateTime.Today; Rating = 1; } Add a public Command property to NewEntryViewModel and name it SaveCommand. This property will be used to bind to the Save ToolbarItem in NewEntryPage.xaml. 
The Xamarin. Forms Command type implements System.Windows.Input.ICommand to provide an Action to run when the command is executed, and a Func to determine whether the command can be executed: public class NewEntryViewModel : BaseViewModel { // ... Command _saveCommand; public Command SaveCommand { get { return _saveCommand ?? (_saveCommand = new Command(ExecuteSaveCommand, CanSave)); } } void ExecuteSaveCommand() { var newItem = new TripLogEntry { Title = Title, Latitude = Latitude, Longitude = Longitude, Date = Date, Rating = Rating, Notes = Notes }; } bool CanSave () { return !string.IsNullOrWhiteSpace (Title); } } In order to keep the CanExecute function of the SaveCommand up to date, we will need to call the SaveCommand.ChangeCanExecute() method in any property setters that impact the results of that CanExecute function. In our case, this is only the Title property: public string Title { get { return _title; } set { _title = value; OnPropertyChanged(); SaveCommand.ChangeCanExecute(); } } The CanExecute function is not required, but by providing it, you can automatically manipulate the state of the control in the UI that is bound to the Command so that it is disabled until all of the required criteria are met, at which point it becomes enabled. Next, set NewEntryViewModel as the BindingContext for NewEntryPage: public NewEntryPage() { InitializeComponent(); BindingContext = new NewEntryViewModel(); // ... } Next, update the EntryCell elements in NewEntryPage.xaml to bind to the NewEntryViewModel properties: <EntryCell Label="Title" Text="{Binding Title}" /> <EntryCell Label="Latitude" Text="{Binding Latitude}" ... /> <EntryCell Label="Longitude" Text="{Binding Longitude}" ... /> <EntryCell Label="Date" Text="{Binding Date, StringFormat='{0:d}'}" /> <EntryCell Label="Rating" Text="{Binding Rating}" ... /> <EntryCell Label="Notes" Text="{Binding Notes}" /> Finally, we will need to update the Save ToolbarItem element in NewEntryPage.xaml  to bind to the NewEntryViewModel SaveCommand property: <ToolbarItem Text="Save" Command="{Binding SaveCommand}" /> Now, when we run the app and navigate to the new entry page, we can see the data binding in action, as shown in the following screenshots. Notice how the Save button is disabled until the title field contains a value: To summarize, we updated the app that we had created in this article; Create a basic travel app using Xamarin.Forms. We removed data and data-related logic from the Pages, offloading it to a series of ViewModels and then binding the Pages to those ViewModels. If you liked this tutorial, read our book, Mastering Xamaring.Forms , to create an architecture rich mobile application with good design patterns and best practices using Xamarin.Forms. Xamarin Forms 3, the popular cross-platform UI Toolkit, is here! Five reasons why Xamarin will change mobile development Creating Hello World in Xamarin.Forms_sample
Build an IoT application with Azure IoT [Tutorial]

Gebin George
21 Jun 2018
13 min read
Azure IoT makes it relatively easy to build an IoT application from scratch. In this tutorial, you'll find out how to do it. This article is an excerpt from the book, Enterprise Internet of Things Handbook, written by Arvind Ravulavaru. This book will help you work with various trending enterprise IoT platforms. End-to-end communication To get started with Azure IoT, we need to have an Azure account. If you do not have an Azure account, you can create one by navigating to this URL: https://azure.microsoft.com/en-us/free/. Once you have created your account, you can log in and navigate to the Azure portal or you can visit https://portal.azure.com to reach the required page. Setting up the IoT hub The following are the steps required for the setup. Once we are on the portal dashboard, we will be presented with the dashboard home page as illustrated here: Click on +New from the top-left-hand-side menu, then, from the Azure Marketplace, select Internet of Things | IoT Hub, as depicted in the following screenshot: Fill in the IoT hub form to create a new IoT hub, as illustrated here: I have selected F1-Free for the pricing and selected Free Trial as a Subscription, and, under Resource group, I have selected Create new and named it Pi3-DHT11-Node. Now, click on the Create button. It will take a few minutes for the IoT hub to be provisioned. You can keep an eye on the notifications to see the status. If everything goes well, on your dashboard, under the All resources tile, you should see the newly created IoT hub. Click on it and you will be taken to the IoT hub page. From the hub page, click on IoT Devices under the EXPLORERS section and you should see something similar to what is shown in the following screenshot: As you can see, there are no devices. Using the +Add button at the top, create a new device. Now, in the Add Device section, fill in the details as illustrated here: Make sure the device is enabled; else you will not be able to connect using this device ID. Once the device is created, you can click on it to see the information shown in the following screenshot: Do note the Connection string-primary key field. We will get back to this in our next section. Setting up Raspberry Pi on the DHT11 node Now that we have our device set up in Azure IoT, we are going to complete the remaining operations on the Raspberry Pi 3 to send data. Things needed The things required to set up the Raspberry Pi DHT11 node are as follows: One Raspberry Pi 3: https://www.amazon.com/Raspberry-Pi-Desktop-Starter-White/dp/B01CI58722 One breadboard: https://www.amazon.com/Solderless-Breadboard-Circuit-Circboard-Prototyping/dp/B01DDI54II/ One DHT11 sensor: https://www.amazon.com/HiLetgo-Temperature-Humidity-Arduino-Raspberry/dp/B01DKC2GQ0 Three male-to-female jumper cables: https://www.amazon.com/RGBZONE-120pcs-Multicolored-Dupont-Breadboard/dp/B01M1IEUAF/ If you are new to the world of Raspberry Pi GPIO's interfacing, take a look at Raspberry Pi GPIO Tutorial: The Basics Explained video tutorial on YouTube: https://www.youtube.com/watch?v=6PuK9fh3aL8. The steps for setting up the smart device are as follows: Connect the DHT11 sensor to the Raspberry Pi 3 as shown in the following schematic: Next, power up the Raspberry Pi 3 and log into it. On the desktop, create a new folder named Azure-IoT-Device. Open a new Terminal and cd into this newly created folder. 
Setting up Node.js

If Node.js is not installed, please refer to the following steps:

1. Open a new Terminal and run the following commands:

```
$ sudo apt update
$ sudo apt full-upgrade
```

This will upgrade all the packages that need upgrades.

2. Next, we will install the latest version of Node.js. We will be using the Node 7.x version:

```
$ curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash -
$ sudo apt install nodejs
```

3. This will take a moment to install, and once the installation is done, you should be able to run the following commands to see the versions of Node.js and NPM:

```
$ node -v
$ npm -v
```

Developing the Node.js device app

Now we will set up the app and write the required code:

1. From the Terminal, once you are inside the Azure-IoT-Device folder, run the following command:

```
$ npm init -y
```

2. Next, we will install azure-iot-device-mqtt from NPM (https://www.npmjs.com/package/azure-iot-device-mqtt). This module has the required client code to interface with Azure IoT. Along with this, we are going to install the azure-iot-device (https://www.npmjs.com/package/azure-iot-device) and async (https://www.npmjs.com/package/async) modules. Execute the following command:

```
$ npm install azure-iot-device-mqtt azure-iot-device async --save
```

3. Next, we will install rpi-dht-sensor from NPM (https://www.npmjs.com/package/rpi-dht-sensor). This module will help to read the DHT11 temperature and humidity values. Run the following command:

```
$ npm install rpi-dht-sensor --save
```

4. Your final package.json file should look like this:

```json
{
  "name": "Azure-IoT-Device",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "async": "^2.6.0",
    "azure-iot-device-mqtt": "^1.3.1",
    "rpi-dht-sensor": "^0.1.1"
  }
}
```

Now that we have the required dependencies installed, let's continue:

1. Create a new file named index.js at the root of the Azure-IoT-Device folder. Your final folder structure should look similar to the following screenshot:
2. Open index.js in any text editor and update it as illustrated in the code snippet that can be found here: https://github.com/PacktPublishing/Enterprise-Internet-of-Things-Handbook (a sketch of what this device code looks like is included below for reference).
3. In that code, we are creating a new MQTTS client from the connectionString. You can get the value of this connection string from IoT Hub | IoT Devices | Pi3-DHT11-Node | Device Details | Connection string-primary key, as shown in the following screenshot:
4. Update the connectionString in our code with the previous value.

Going back to the code, we are using client.open(connectCallback) to connect to the Azure MQTT broker for our IoT hub, and, once the connection has been made successfully, we call connectCallback(). In connectCallback(), we get the device twin using client.getTwin(). Once we have the device twin, we start collecting the data, send this data to other clients listening to this device using client.sendEvent(), and then send a copy to the device twin using twin.properties.reported.update, so any new client that joins gets the latest saved data.

Now, save the file and run the sudo node index.js command. We should see the command output in the console of Raspberry Pi 3: The device has successfully connected, and we are sending the data to both the device twin and the MQTT event.
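For reference, the device code linked above follows roughly the shape sketched here. This is an approximation written against the azure-iot-device and azure-iot-device-mqtt 1.x APIs, not the exact snippet from the repository: the placeholder connection string, the GPIO pin, and the 30-second interval are assumptions based on the surrounding text, so prefer the version in the book's GitHub repository when in doubt.

```javascript
// Sketch of index.js for the Azure IoT DHT11 device (assumed structure)
var Client = require('azure-iot-device').Client;
var Message = require('azure-iot-device').Message;
var Mqtt = require('azure-iot-device-mqtt').Mqtt;
var rpiDhtSensor = require('rpi-dht-sensor');

var dht = new rpiDhtSensor.DHT11(2); // GPIO2, same wiring as shown earlier
// Paste the value of Connection string-primary key from Device Details here
var connectionString = 'HostName=<your-hub>.azure-devices.net;DeviceId=Pi3-DHT11-Node;SharedAccessKey=<key>';

var client = Client.fromConnectionString(connectionString, Mqtt);

client.open(function(err) {
  if (err) { return console.error('Could not connect:', err); }
  console.log('Client connected');

  client.getTwin(function(err, twin) {
    if (err) { return console.error('Could not get twin:', err); }

    setInterval(function() {
      var readout = dht.read();
      var data = {
        temp: readout.temperature.toFixed(2),
        humd: readout.humidity.toFixed(2)
      };

      // Device-to-cloud event for clients listening in real time
      client.sendEvent(new Message(JSON.stringify(data)), function(err) {
        if (err) { console.error('sendEvent error:', err); }
      });

      // Report the same values on the device twin so late joiners get the last state
      twin.properties.reported.update(data, function(err) {
        if (err) { console.error('Twin update error:', err); }
      });
    }, 30000); // assumed 30-second interval
  });
});
```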
Now, if we head back to the Azure IoT portal, navigate to IoT Hub | IoT Devices | Pi3-DHT11-Node | Device Details and click on the device twin, we should see the last data record that was sent by the Raspberry Pi 3, as shown in the following image: Now that we are able to send the data from the device, let's read this data from another MQTT client.

Reading the data from the IoT Thing

To read the data from the device, you can either use the same Raspberry Pi 3 or another computer. I am going to use my MacBook as a client that is interested in the data sent by the device:

1. Create a folder named test_client.
2. Inside the test_client folder, run the following command:

```
$ npm init --yes
```

3. Next, install the azure-event-hubs module (https://www.npmjs.com/package/azure-event-hubs) using the following command:

```
$ npm install azure-event-hubs --save
```

4. Create a file named index.js inside the test_client folder and update it as detailed in the following code snippet:

```javascript
var EventHubClient = require('azure-event-hubs').Client;

var connectionString = 'HostName=Pi3-DHT11-Nodes.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=J0MTJVy+RFkSaaenfegGMJY3XWKIpZp2HO4eTwmUNoU=';

const TAG = '[TEST DEVICE] >>>>>>>>> ';

var printError = function(err) {
    console.log(TAG, err);
};

var printMessage = function(message) {
    console.log(TAG, 'Message received:', JSON.stringify(message.body));
};

var client = EventHubClient.fromConnectionString(connectionString);

client.open()
    .then(client.getPartitionIds.bind(client))
    .then(function(partitionIds) {
        return partitionIds.map(function(partitionId) {
            return client.createReceiver('$Default', partitionId, {
                    'startAfterTime': Date.now()
                })
                .then(function(receiver) {
                    // console.log(TAG, 'Created partition receiver: ' + partitionId);
                    console.log(TAG, 'Listening...');
                    receiver.on('errorReceived', printError);
                    receiver.on('message', printMessage);
                });
        });
    })
    .catch(printError);
```

In the previous code snippet, we have a connectionString variable. To get its value, head back to the Azure portal, via IoT Hub | Shared access policies | iothubowner | Connection string-primary key, as illustrated in the following screenshot: Copy the value from the Connection string-primary key field and update the code.

5. Finally, run the following command:

```
$ node index.js
```

The following console screenshot shows the command's output: This way, any client that is interested in the data from this device can use this approach to get the latest data. You can also use an MQTT library on the client side to do the same, but do keep in mind that this is not advisable, as the connection string is exposed. Instead, you can have a backend microservice that can achieve the same for you and then expose the data via HTTPS.

With this, we conclude the section on posting data to Azure IoT and fetching that data. In the next section, we are going to work on building a dashboard for our data.

Building a dashboard

Now that we have seen how a client can read data from our device on demand, we will move on to building a dashboard on which we will show data in real time. For this, we are going to use an Azure stream analytics job and Power BI.

Azure stream analytics

Azure stream analytics is a managed event-processing engine set up for real-time analytic computations on streaming data. It can gather data coming from various sources, collate it, and stream it into a different source.
Using stream analytics, we can examine high volumes of data streamed from devices, extract information from that data stream, and identify patterns, trends, and relationships. Read more about Azure stream analytics. Power BI Power BI is a suite of business-analytics tools used to analyze data and share insights. A Power BI dashboard updates in real time and provides a single interface for exploring all the important metrics. With one click, users can explore the data behind their dashboard using intuitive tools that make finding answers easy. Creating dashboards and accessing them across various sources is also quite easy. Read more about Power BI. As we have seen in the architecture section, we are going to follow the steps given in the next section to create a dashboard in Power BI. Execution steps These are the steps that need to be followed: Create a new Power BI account. Set up a new consumer group for events (built-in endpoint). Create a stream analytics job. Set up input and outputs. Build the query to stream data from the Azure IoT hub to Power BI. Visualize the datasets in Power BI and build a dashboard. So let's get started. Signing up to Power BI Navigate to the Power BI sign-in page, and use the Sign up free option and get started today form on this page to create an account. Once an account has been created, validate the account. Log in to Power BI with your credentials and you will land on your default workspace. At the time of writing, Power BI needs an official email to create an account. Setting up events Now that we have created a new Power BI, let's set up the remaining pieces: Head back to https://portal.azure.com and navigate to the IoT hub we have created. From the side menu inside the IoT hub page, select Endpoints then Events under the Built-in endpoints section. When the form opens, under the Consumer groups section, create a new consumer group with the name, pi3-dht11-stream, as illustrated, and then click on the Save button to save the changes: Next, we will create a new stream analytics job. Creating a stream analytics job Let's see how to create a stream analytics job by following these steps: Now that the IoT hub setup is done, head back to the dashboard. From the top-left menu, click on +New, then Internet of Things and Stream Analytics job, as shown in the following screenshot: Fill in the New Stream Analytics job form, as illustrated here: Then click on the Create button. It will take a couple of minutes to create a new job. Do keep an eye on the notification section for any updates. Once the job has been created, it will appear on your dashboard. Select the job that was created and navigate to the Inputs section under JOB TOPOLOGY, as shown here: Click on +Add stream input and select IoT Hub, as shown in the previous screenshot. Give the name pi3dht11iothub to the input alias, and click on Save. Next, navigate to the Outputs section under JOB TOPOLOGY, as shown in the following screenshot: Click +Add and select Power BI, as shown in the previous screenshot. Fill in the details given in the following table: Field Value Output alias powerbi Group workspace My workspace (after completing the authorization step) Dataset name pi3dht11 Table name dht11 Click the Authorize button to authorize the IoT hub to create the table and datasets, as well as to stream data. The final form before creation should look similar to this: Click on Save. Next, click on Query under JOB TOPOLOGY and update it as depicted in the following screenshot: Now, click on the Save button. 
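The query referenced in the last step is only visible in the screenshot. Based on the input alias (pi3dht11iothub) and output alias (powerbi) created above, and on the temp and humd fields the device reports, a query along the following lines would route the readings to Power BI. Treat this as a sketch rather than the exact query used in the book:

```sql
SELECT
    temp,
    humd,
    EventEnqueuedUtcTime
INTO
    [powerbi]
FROM
    [pi3dht11iothub]
```

If the readings arrive as strings rather than numbers, you may also need to CAST them to float in the SELECT list.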
Next, head over to the Overview section, click on Start, select Now, and then click on Start: Once the job starts successfully, you should see the Status of Running instead of Starting. Running the device Now that the entire setup is done, we will start pumping data into the Power BI. Head back to the Raspberry Pi 3 that was sending the DHT11 temperature and humidity data, and run our application. We should see the data being published to the IoT hub as the Data Sent log gets printed: Building the visualization Now that the data is being pumped to Power BI via the Azure IoT hub and stream analytics, we will start building the dashboard: Log in to Power BI, navigate to the My Workspace that we selected when we created the Output in the Stream Analytics job, and select Datasets. We should see something similar to the screenshot illustrated here: Using the first icon under the ACTIONS column, for the pi3dht11 dataset, create a new report. When you are in the report page, under VISUALIZATIONS, select line chart, drag EventEnqueuedUtcTime to the Axis field, and set the temp and humd fields to the values as shown in the following screenshot: You can also see the graph data in real time. You can save this report for future reference. This wraps up the section of building a visualization using Azure IoT hub, a stream analytics job, and Power BI. With this, we have seen the basic features and implementation of an end to end IoT application with Azure IoT platform. If you found this post useful, do check out the book, Enterprise Internet of Things Handbook, to build end-to-end IoT solutions using popular IoT platforms. Introduction to IOT Introducing IoT with Particle's Photon and Electron Five developer centric sessions at IoT World 2018