Getting started in R: Writing your own functions


R is a programming language and environment for statistical computing and graphics, and students and statisticians alike have come to rely on it to analyze their data. The scope and power of R are large, but for the purposes of getting started, it is important to understand the benefits of writing your own functions, along with how to do so.

One of the great advantages of R is that there are so many built-in functions, and so many additional functions that come with packages. So, one might ask: why would you bother writing your own functions?

  • If you find yourself copy-and-pasting the same block of code over and over for different sets of data, putting that block of code within a function can prevent mistakes (like forgetting to change a variable name)
  • It makes your code neater and easier to understand for you and others – and as a result, easier to debug!
  • If you need to change part of the code that you are reusing, you don’t need to change it multiple times
When I first started coding in R, I found it challenging to write functions (and I still do!), but over time, I’ve realized how much more powerful, effective, and efficient my code is with functions. This is a skill that I am still developing. As such, don’t treat this blog post as a comprehensive guide to writing functions; it is written for beginners who are looking to get their feet wet.

How to write your own functions

Below is the example we’ll be working with:

[Code screenshot 1]
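The code for this example is not reproduced in the text, so here is a minimal sketch of what the copy-and-paste version might look like. The vector values and the names `x_mean`, `x_new`, and so on are assumptions chosen for illustration, not the original's.

```r
# Hypothetical values; the original vectors are not shown in the text.
x <- c(1, 2, 3)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# For each vector: compute the mean, then multiply every element by it.
x_mean <- mean(x)
x_new <- x * x_mean

y_mean <- mean(y)
y_new <- y * y_mean

z_mean <- mean(z)
z_new <- z * z_mean
```

Note how the same two lines are repeated three times, with only the variable names changed. This is exactly where a copy-and-paste slip (such as leaving `x_mean` in the block for `y`) can creep in.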

Here we have 3 different vectors, named “x,” “y,” and “z.” We want to calculate the mean of each vector and multiply each number in the vector by that mean. This example uses simple commands, but you can imagine how frustrating it would be if we had more than 3 vectors or more than 2 steps to perform.

So, to put the above code in your own function, there are a few basic steps.

  1. Choose a name for your function (usually something descriptive so you know what the function is generally doing).
  2. Determine what inputs are needed, and put them in the parentheses after “function.” The names that you give these inputs are placeholders to be used within the function.
  3. Put the block of code you want to run on each input after “function(…)” and enclosed by “{ }”.

Applying these steps to our example looks like this:

[Code screenshot 2]
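As a sketch, a function following the three steps could be written as below. The function name `multiply_by_mean`, the argument name `example`, and the use of `return()` come from the post; the internal variable names `example_mean` and `example_new` are assumptions.

```r
# Step 1: a descriptive name. Step 2: one input, called `example`.
multiply_by_mean <- function(example) {
  # Step 3: the block of code to run on the input.
  example_mean <- mean(example)          # mean of the input vector
  example_new <- example * example_mean  # each element times the mean
  return(example_new)
}
```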

Here we’ve named our function “multiply_by_mean” and we have one input, or argument, that we’ve named “example” – note that this argument name can be anything, just as long as the name you put here is consistent with the body of the function. Then, we put the commands we want inside “{ }” and we return the output we’re interested in – in this case, the vector with each element inside multiplied by the mean of the vector.

To run this function, we simply pass the vector we’re interested in as the argument, in place of the placeholder name, and we can assign the output of the function to a suitably named variable.

[Code screenshot 3]
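A minimal sketch of calling the function on each vector might look like this. The vector values and the output names (`x_new`, `y_new`, `z_new`) are hypothetical, and the function is repeated so the snippet runs on its own.

```r
multiply_by_mean <- function(example) {
  example_mean <- mean(example)
  example_new <- example * example_mean
  return(example_new)
}

# Hypothetical values; the original vectors are not shown in the text.
x <- c(1, 2, 3)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# One call per vector replaces the repeated block of code.
x_new <- multiply_by_mean(x)
y_new <- multiply_by_mean(y)
z_new <- multiply_by_mean(z)
```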

With the function, we get the exact same results as above.

Let’s say we actually want to add 2 to the mean before multiplying by the vector. Having a function means we only need to change the code once inside the function, rather than changing the code each time we apply it to a different vector. Additionally, anyone reading this code can more easily see that this sequence of commands is a single operation.

[Code screenshot 4]

[Code screenshot 5]
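One way this change could look, as a sketch with hypothetical vector values and internal variable names: only the line that computes the mean is edited, and every call picks up the new behavior.

```r
multiply_by_mean <- function(example) {
  example_mean <- mean(example) + 2      # the only line that changes
  example_new <- example * example_mean
  return(example_new)
}

# Hypothetical values; the original vectors are not shown in the text.
x <- c(1, 2, 3)
y <- c(10, 20, 30)

# The calls themselves stay exactly as before.
x_new <- multiply_by_mean(x)
y_new <- multiply_by_mean(y)
```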

A couple of additional notes about writing functions:

  • The “return(…)” command at the end of the function is not strictly necessary, since R automatically returns the value of the last expression evaluated in the function. I thought it would be helpful for beginners to see within the body of the function what will be output, but including the return statement is largely a matter of personal style.
  • In a similar vein, it is possible to run “multiply_by_mean(…)” without assigning the output to a variable.

[Code screenshot 6]

  • However, most of the time you want to use the output again (as the input in another function for example), so it’s good practice to store the outputs of functions.
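Both notes can be illustrated with a short sketch. The vector values and the name `x_total` are hypothetical; the function body relies on R returning the value of the last evaluated expression, so no `return()` is written.

```r
# No explicit return(): the last expression's value is returned.
multiply_by_mean <- function(example) {
  example * mean(example)
}

# Hypothetical values; the original vectors are not shown in the text.
x <- c(1, 2, 3)

# Called without assignment: the result is printed but not saved.
multiply_by_mean(x)

# Stored, so the output can be used as the input to another function.
x_new <- multiply_by_mean(x)
x_total <- sum(x_new)
```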

Though it may seem more onerous at first, writing your own functions can save you time in the long run, and it leads to cleaner, more understandable code.

