Modern R Programming Cookbook

Recipes to simplify your statistical applications

Product type Paperback

Published in Oct 2017

Publisher Packt

ISBN-13 9781787129054

Length 236 pages

Edition 1st Edition

Languages

R

Concepts

Programming Language

Author (1):

Jaynal Abedin


Table of Contents (16 chapters)

Title Page

Credits

About the Author

About the Reviewer

www.PacktPub.com

Customer Feedback

Preface

1. Installing and Configuring R and its Libraries

2. Data Structures in R

3. Writing Customized Functions

4. Conditional and Iterative Operations

5. R Objects and Classes

6. Querying, Filtering, and Summarizing

7. R for Text Processing

8. R and Databases

9. Parallel Processing in R

Extracting text data from an HTML page

In the Extracting unstructured text data from a plain web page recipe in this chapter, you saw an example of reading the HTML source code as a text vector. With that approach, further processing is not straightforward because the output object contains plain text mixed with HTML tags, and stripping those tags out to recover the plain text is time-consuming.
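To make the difficulty concrete, here is a minimal base R sketch of cleaning tags by hand. The `html_lines` vector is a hypothetical stand-in for the kind of text vector you get from reading a page line by line; it is not the actual Wikipedia source, and the regular expression is only one naive way to do the cleanup.

```r
# Hypothetical lines standing in for the text vector read from a web page
html_lines <- c(
  "<html><body>",
  "<h1 id=\"top\">Programming with Big Data in R</h1>",
  "<p>pbdR is a <b>series</b> of R packages.</p>",
  "</body></html>"
)

# Naive cleanup: delete anything that looks like a tag
no_tags <- gsub("<[^>]+>", "", html_lines)

# Drop the lines that are empty once the tags are gone
plain_text <- trimws(no_tags[nzchar(trimws(no_tags))])

print(plain_text)
```

Even in this small case, the result no longer records which line was a heading and which was a paragraph, and the pattern would miss tags that span several lines or text inside script blocks.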

In this recipe, you will read the same web page from the following link:

https://en.wikipedia.org/wiki/Programming_with_Big_Data_in_R

However, this time, you will use a different strategy so that you can work with the HTML tags directly.

Getting ready

To implement this recipe, you need an add-on library, namely the rvest library. If this library has not been installed on your computer, now is the time to install it along with its dependencies. Here is the code to install the rvest library:

    install.packages("rvest", dependencies = TRUE)

Once the installation has been completed, you are...
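As a rough illustration of the tag-aware strategy this recipe introduces, the sketch below parses a small HTML string with rvest and pulls out text by tag name. The `html_source` string is a hypothetical stand-in for the Wikipedia page so that the example runs without a network connection, and the choice of the `p` and `h1` selectors is an assumption for illustration, not necessarily what the recipe itself goes on to use.

```r
library(rvest)

# Hypothetical inline HTML used in place of the live Wikipedia page
html_source <- "<html><body>
  <h1>Programming with Big Data in R</h1>
  <p>pbdR is a <b>series</b> of R packages.</p>
  <p>It targets distributed computing.</p>
</body></html>"

# Parse the document into a tree instead of a plain text vector
page <- read_html(html_source)

# Select nodes by tag name, then extract their text content
heading <- html_text(html_nodes(page, "h1"))
paragraphs <- html_text(html_nodes(page, "p"))

print(heading)
print(paragraphs)
```

Because the page is parsed into a tree, each piece of text stays tied to the tag it came from, so no manual tag stripping is needed. `read_html()` also accepts a URL, so the same calls should work on the page linked above when a connection is available.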


Authors (1)

Jaynal Abedin

Jaynal Abedin currently holds the position of Statistician at the Centre for Communicable Diseases (CCD) at icddr,b (www.icddrb.org). He attained his Bachelor's and Master's degrees in Statistics from the University of Rajshahi, Rajshahi, Bangladesh. He has extensive experience in R programming and Stata, has strong leadership skills, and currently leads a team of statisticians. He has hands-on experience in developing training material and facilitating training in R programming and Stata, along with statistical aspects of public health research. His primary research interests include causal inference and machine learning. He is currently involved in several ongoing public health research projects and is a co-author of several work-in-progress manuscripts. At the useR! Conference 2013, he presented a poster, edeR: Email Data Extraction using R (available at http://www.edii.uclm.es/~useR-2013/abstracts/files/34_edeR_Email_Data_Extraction_using_R.pdf), which received the best application poster award. He also reviews scientific manuscripts for the Journal of Applied Statistics (JAS) and the Journal of Health Population and Nutrition (JHPN). He is a successful freelance statistician on online platforms, with an excellent reputation for high-quality work, especially in R programming. He can be contacted at http://bd.linkedin.com/in/jaynal; his Twitter handle is @jaynal83.
