Getting started in R: Writing your own functions

R is a programming language and software environment for statistical computing and graphics, and students and statisticians alike have come to rely on it to analyze their data. The scope and power of R are large, but for the purposes of getting started, it is important to understand the benefits of writing your own functions in R, along with how to do so.

One of the great advantages of R is that it has so many built-in functions, and so many additional functions that come with packages. So, one might ask: why would you bother writing your own functions?

  • If you find yourself copying and pasting the same block of code over and over for different sets of data, putting that block of code within a function can prevent mistakes (like forgetting to change a variable name).
  • It makes your code neater and easier to understand for you and others and, as a result, easier to debug!
  • If you need to change part of the code that you are reusing, you only need to change it once, inside the function.
When I first started coding in R, I found it challenging to write functions (and I still do!), but over time, I’ve realized how much more powerful, effective, and efficient my code is with functions. This is a skill that I am still developing. As such, don’t treat this post as a comprehensive guide to writing functions; it is written for beginners who are looking to get their feet wet.

How to write your own functions

Below is the example we’ll be working with:

[Image: R code defining the vectors x, y, and z, computing each mean, and multiplying each vector by its mean]

Here we have 3 different vectors, named “x”, “y”, and “z”. We want to calculate the mean of each vector and multiply each number in the vector by that mean. This example uses simple commands, but you can imagine how frustrating it would be if we had more than 3 vectors or more than 2 steps to perform.
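Since the original code was shown as an image, here is a minimal sketch of what the repeated, function-free version might look like. The values in the vectors are hypothetical, chosen only for illustration, as are the variable names for the means and results.

```r
# Hypothetical example values; the original vectors are not shown in the text.
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# Step 1: calculate the mean of each vector
mean_x <- mean(x)
mean_y <- mean(y)
mean_z <- mean(z)

# Step 2: multiply every element of each vector by that vector's mean
x_result <- x * mean_x
y_result <- y * mean_y
z_result <- z * mean_z
```

Notice that the same two steps are written out three times, with only the variable names changing. That is exactly the kind of repetition where a copy-and-paste mistake can slip in.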

So, to put the above code in your own function, there are a few basic steps.

  1. Choose a name for your function (usually something descriptive so you know what the function is generally doing).
  2. Determine what inputs are needed, and put them in the parentheses after “function.” The names that you give these inputs are placeholders to be used within the function.
  3. Put the block of code you want to run on each input after “function(…)” and enclosed by “{ }”.

Applying these steps to our example looks like this:

[Image: R code defining the multiply_by_mean function]

Here we’ve named our function “multiply_by_mean” and we have one input, or argument, that we’ve named “example” – note that this argument name can be anything, just as long as the name you put here is consistent with the body of the function. Then, we put the commands we want inside “{ }” and we return the output we’re interested in – in this case, the vector with each element inside multiplied by the mean of the vector.
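A function matching this description could be sketched as follows. The function name and the argument name “example” come from the text; the internal variable names (example_mean, output) are assumptions made for illustration.

```r
# 1. The name of the function   2. The input, with a placeholder name
multiply_by_mean <- function(example) {
  # 3. The block of code to run on the input, enclosed by { }
  example_mean <- mean(example)        # mean of the input vector
  output <- example * example_mean     # each element times that mean
  return(output)                       # the result we are interested in
}
```

The placeholder “example” only exists inside the function. Renaming it would work just as well, provided every use of it in the body is renamed too.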

To run this function, we pass the vector we’re interested in as the argument, in place of the placeholder name, and we can assign the output of the function to a suitably named variable.

[Image: R code calling multiply_by_mean on each vector and storing the output]

With the function, we get the exact same results as above.
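As a rough illustration, calling the function on each vector might look like the sketch below. The vector values and the output names (x_new, y_new, z_new) are hypothetical, and the function is repeated so the snippet runs on its own.

```r
multiply_by_mean <- function(example) {
  example_mean <- mean(example)
  output <- example * example_mean
  return(output)
}

# Hypothetical example values
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)
z <- c(2, 4, 6, 8)

# One call per vector, each output stored under its own name
x_new <- multiply_by_mean(x)
y_new <- multiply_by_mean(y)
z_new <- multiply_by_mean(z)

# Check that the function agrees with the step-by-step calculation
same_as_manual <- isTRUE(all.equal(x_new, x * mean(x)))
```

Here same_as_manual is TRUE, which confirms that the function reproduces the step-by-step result for x.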

Let’s say we actually want to add 2 to the mean before multiplying by the vector. Having a function means we only need to change the code once inside the function, rather than changing the code each time we apply it to a different vector. Additionally, anyone reading this code can more easily see that this sequence of commands is a single operation.

[Images: R code for the updated multiply_by_mean function, which adds 2 to the mean before multiplying]
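One way the updated function could look is sketched here, assuming the same hypothetical variable names and example values as before. Only one line inside the body changes; the calls that use the function stay exactly as they were.

```r
multiply_by_mean <- function(example) {
  example_mean <- mean(example) + 2    # the only line that changes
  output <- example * example_mean
  return(output)
}

# Hypothetical example values
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20, 30)

# The calls are unchanged
x_new <- multiply_by_mean(x)
y_new <- multiply_by_mean(y)
```

Without a function, the “+ 2” would have to be added separately for every vector, with a chance of missing one.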

A couple of additional notes about writing functions:

  • The “return(…)” command at the end of the function is not strictly necessary, since R automatically returns the value of the last expression evaluated in the function. I thought it would be helpful for beginners to see within the body of the function what will be returned, but including the return statement is largely a matter of personal style.
  • In a similar vein, it is possible to run “multiply_by_mean(…)” without assigning the output to a variable.

[Image: R code running multiply_by_mean(…) without storing the output]

  • However, most of the time you want to use the output again (as the input in another function for example), so it’s good practice to store the outputs of functions.
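The notes above can be illustrated with a short sketch. The vector values and the use of sum() as the “other function” are assumptions chosen for illustration.

```r
# No explicit return(): the value of the last evaluated expression is returned
multiply_by_mean <- function(example) {
  example_mean <- mean(example)
  example * example_mean
}

x <- c(1, 2, 3, 4, 5)    # hypothetical example values

# Run without storing: the result is printed at the console and then lost
multiply_by_mean(x)

# Run and store: the result can be reused as the input to another function
x_new <- multiply_by_mean(x)
total <- sum(x_new)
```

Storing the result as x_new is what makes the second step possible without recomputing anything.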

Though it may seem more onerous at first, writing your own functions can save you time in the long run and leads to cleaner, more understandable code.

