Getting started in R: Writing your own functions

R is a programming language and software environment for statistical computing and graphics, and students and statisticians alike have come to rely on it to analyze their data. Its scope and power are large, but for the purposes of getting started, it is important to understand the benefits of writing your own functions in R, along with how to do so.

One of the great advantages of R is that it has so many built-in functions, plus many more that come with packages. So one might ask: why bother writing your own functions?

  • If you find yourself copying and pasting the same block of code over and over for different sets of data, putting that block of code inside a function can prevent mistakes (like forgetting to change a variable name).
  • It makes your code neater and easier to understand for you and others, and as a result, easier to debug!
  • If you need to change part of the code that you are reusing, you only need to change it in one place rather than in every copy.

When I first started coding in R, I found it challenging to write functions (and I still do!), but over time, I’ve realized how much more powerful, effective, and efficient my code is with functions. This is a skill that I am still developing. As such, don’t treat this post as a comprehensive guide to writing functions. It is written for beginners who are looking to get their feet wet.

How to write your own functions

Below is the example we’ll be working with:

[Code screenshot not reproduced here: three vectors, each multiplied by its own mean]
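The original post showed this code as a screenshot. A minimal sketch of what such code might look like is below; the vector values and the variable names `x_mean`, `x_result`, and so on are assumptions chosen for illustration, not the original values.

```r
# Three example vectors (the values are illustrative assumptions)
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# For each vector: compute the mean, then multiply every element by it.
# Note how the same two steps are copied three times.
x_mean <- mean(x)
x_result <- x * x_mean

y_mean <- mean(y)
y_result <- y * y_mean

z_mean <- mean(z)
z_result <- z * z_mean
```

Every copy has to be edited by hand, so it is easy to leave a stray `x` in the block meant for `y`.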

Here we have 3 different vectors, named “x,” “y,” and “z.” We want to calculate the mean of each vector and then multiply each number in that vector by the vector’s mean. This example uses simple commands, but you can imagine how frustrating it would be if we had more than 3 vectors or more than 2 steps to perform.

So, to put the above code in your own function, there are a few basic steps.

  1. Choose a name for your function (usually something descriptive so you know what the function is generally doing).
  2. Determine what inputs are needed, and put them in the parentheses after “function.” The names that you give these inputs are placeholders to be used within the function.
  3. Put the block of code you want to run on each input after “function(…)” and enclosed by “{ }”.
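As a rough template, the three steps map onto R syntax as sketched below. The names `double_values` and `input_name` are hypothetical placeholders used only to show where each piece goes.

```r
# Step 1: the name on the left of "<-" is the function's name.
# Step 2: the inputs go inside the parentheses after "function".
# Step 3: the code to run on the input goes between "{" and "}".
double_values <- function(input_name) {
  output <- input_name * 2
  return(output)
}
```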

Applying these steps to our example looks like this:

[Code screenshot not reproduced here: the definition of the multiply_by_mean function]
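Since the original screenshot is not available, here is one way the function could be written. The function name `multiply_by_mean` and the argument name `example` come from the post; the internal names `example_mean` and `output` are assumptions.

```r
# Multiply every element of a numeric vector by that vector's mean.
multiply_by_mean <- function(example) {
  example_mean <- mean(example)      # mean of whatever vector is passed in
  output <- example * example_mean   # element-wise multiplication
  return(output)
}
```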

Here we’ve named our function “multiply_by_mean” and we have one input, or argument, that we’ve named “example.” Note that this argument name can be anything, as long as the same name is used in the body of the function. Then, we put the commands we want inside “{ }” and we return the output we’re interested in: in this case, the vector with each element multiplied by the vector’s mean.

To run this function, we call it with the vector we’re interested in as the argument, and we can assign the output of the function to a variable with a suitable name.

[Code screenshot not reproduced here: calling multiply_by_mean on each vector]
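A sketch of what these calls might look like follows. The vector values and the result names (`x_result` and so on) are illustrative assumptions, and the function is repeated so the snippet runs on its own.

```r
# Function definition (repeated so this snippet is self-contained)
multiply_by_mean <- function(example) {
  example_mean <- mean(example)
  output <- example * example_mean
  return(output)
}

# Illustrative vectors (assumed values)
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# One call per vector; each result is stored under its own name
x_result <- multiply_by_mean(x)
y_result <- multiply_by_mean(y)
z_result <- multiply_by_mean(z)
```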

With the function, we get the exact same results as above.

Let’s say we actually want to add 2 to the mean before multiplying by the vector. Having a function means we only need to change the code once inside the function, rather than changing the code each time we apply it to a different vector. Additionally, anyone reading this code can more easily see that this sequence of commands is a single operation.

[Two code screenshots showing this change are not reproduced here]
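A possible version of the updated function is sketched below, assuming the same internal names as before and illustrative vector values. Only the line that computes the mean changes; the calls stay exactly the same.

```r
# Updated function: add 2 to the mean before multiplying
multiply_by_mean <- function(example) {
  example_mean <- mean(example) + 2   # the only line that changed
  output <- example * example_mean
  return(output)
}

# Illustrative vectors (assumed values)
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# The calls themselves are untouched
x_result <- multiply_by_mean(x)
y_result <- multiply_by_mean(y)
z_result <- multiply_by_mean(z)
```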

A couple of additional notes about writing functions:

  • The “return(…)” command at the end of the function is actually not necessary, since R automatically returns the value of the last expression evaluated in the function. I thought it would be helpful for beginners to see within the body of the function what will be output, but including the return statement is largely a matter of personal style.
  • In a similar vein, it is possible to run “multiply_by_mean(…)” without assigning the output to a variable.

[Code screenshot not reproduced here: calling multiply_by_mean(…) without storing the output]

  • However, most of the time you want to use the output again (as the input in another function for example), so it’s good practice to store the outputs of functions.
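Both notes can be seen in one short sketch. The vector values and the names `x_result` and `total` are assumptions for illustration.

```r
# Same function without return(): the value of the last evaluated
# expression is what the function gives back.
multiply_by_mean <- function(example) {
  example_mean <- mean(example)
  example * example_mean
}

x <- c(1, 2, 3, 4, 5)   # illustrative values

# Called without assignment: the result is printed but not kept
multiply_by_mean(x)

# Stored: the result can be passed on to another function
x_result <- multiply_by_mean(x)
total <- sum(x_result)
```

One thing to watch for: if the last line of a function is an assignment (for example `output <- example * example_mean`), the value is still returned, but it will not be printed when the function is called on its own at the console.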

Though it may seem more onerous at first, writing your own functions can save you time in the long run, and it leads to cleaner, more understandable code.

