Code Security

Knowing the Basics is Usually Enough

I'm Only Trying to Help You

What are some ways you secure your home or apartment against unwanted visitors? It might be keeping your doors and windows locked, always checking who’s at the door before you open it, or maybe installing a camera or two. Each of these things helps, and they work best when they’re all used together. But if you really think about it, none of these things will protect you from someone who is bound and determined to get inside and isn’t afraid of getting caught or making a mess. Most doors and windows can be kicked in, and cameras only record evidence for later; they don’t prevent anything.

Luckily, most intruders ARE afraid of getting caught and don’t want to leave a mess with lots of evidence that they’ve been there. Instead of trying to break into a home with security cameras, floodlights, and locked doors, they’ll usually go for easier targets. They may choose buildings that aren’t locked or a place they already have a key to, even when the potential reward is smaller. The same is generally true of the internet.

On the internet, hacking is everywhere, and it can never be completely prevented. There are simply too many attackers in too many places. However, most successful hacking, even in high-profile cases, is successful because people didn’t follow basic security measures. That’s the cyber equivalent of leaving your front door unlocked or forgetting to turn on your cameras.

This guide is absolutely not comprehensive, but it covers the most fundamental security practices, the ones that keep our websites and web-connected apps safe from most hacking. For more information on common security vulnerabilities to watch out for, a great place to start is the OWASP Top 10.

Don’t Leave Your Keys Out in the Open

Keep It Secret. Keep it Safe.

The most common way for individuals to get their bank accounts or emails compromised is by giving someone else their password, whether it’s someone they know or through a phishing attack. Software developers can get their apps, sites, servers, or databases compromised by leaving their keys or passwords out in the open where others can see them. This is the equivalent of locking all your doors but leaving a key under the mat where anyone could get it.

How to Keep Your Keys Safe

One common way to expose sensitive information is by committing it along with the rest of your code to GitHub or another repository host. Starting with your initial commit, make sure your code never contains passwords or keys. Anything that has been committed stays in the repository’s history, even if a later commit removes it.

Let’s say that you’re building an app that at some point needs to connect to a database. This requires a valid username and password, as well as other information to locate the database. Your code will require something like this:

Python BAD Example

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)

# Bad practice: Hardcoding sensitive information
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://username:password@localhost/db_name'
app.config['SECRET_KEY'] = 'my_secret_key'

Database usernames and passwords should never be included in your code. Instead, these should be stored as variables in a special file, which is never included in your code commits to GitHub (.env in this example). The way to exclude these files is to list them in the .gitignore file (and .dockerignore if your project uses Docker). When you do this, since this file isn’t in the codebase, it is a courtesy to other programmers to add a template (.env.template) that IS included, so they know which variables they will need to provide for the software to work.

Python GOOD Example

# --- .env ---

DATABASE_URI=mysql://username:password@localhost/db_name
SECRET_KEY=my_secret_key


# --- .env.template ---

DATABASE_URI=[DB_URI_GOES_HERE]
SECRET_KEY=[DB_SECRET_GOES_HERE]


# --- .gitignore ---

.env


# --- main_script.py ---

import os
from dotenv import load_dotenv
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)

# Load variables from .env file so you can access them in os.environ
load_dotenv()

# Access variables
app.config['SQLALCHEMY_DATABASE_URI'] = os.environ.get('DATABASE_URI')
app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY')

You could also set the variables by running a bash script as part of the initialization process. This setup would be (in bash):

export DATABASE_URI='mysql://username:password@localhost/db_name'
export SECRET_KEY='my_secret_key'
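
Whichever way the variables are set, os.environ.get() quietly returns None when one is missing, and the app may start with a broken configuration. The following is a minimal sketch of a fail-fast lookup. The names require_env and MissingSettingError are hypothetical helpers invented for this illustration, not part of Flask or python-dotenv.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name, environ=os.environ):
    """Return the value of an environment variable, or fail loudly.

    Unlike os.environ.get(), this never silently returns None.
    """
    value = environ.get(name)
    if not value:
        # Report only the variable's name, never its value.
        raise MissingSettingError(f"Required setting {name!r} is not set")
    return value


if __name__ == "__main__":
    # Hypothetical stand-in for the real environment, for demonstration only.
    fake_environ = {"DATABASE_URI": "sqlite:///example.db"}
    print(require_env("DATABASE_URI", fake_environ))
    try:
        require_env("SECRET_KEY", fake_environ)
    except MissingSettingError as error:
        print(error)
```

In the Flask setup above, this could replace os.environ.get('SECRET_KEY') so that a missing secret stops the app at startup instead of surfacing later as a confusing error.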

Using GitHub Secrets and GitHub Actions

GitHub Actions is a great addition to a project’s workflow pipeline. When a new batch of code is pushed to the main branch, a workflow can build, test, and launch the app. Keys and secrets for production environments are even more important to safeguard, and one way to do that is using GitHub secrets.

Only a repo owner (or, for an organization’s repo, someone with admin access) may add or change secrets in the repo, even though all developers can reference them in workflow files by variable name. Once the secrets have been saved securely, where even admins can’t see them again, you can add something like the following in your project under .github/workflows/main.yml. Notice where it passes in the secrets using syntax like ${{ secrets.DATABASE_URI }}.

name: CI/CD Workflow

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    env:
      DATABASE_URI: ${{ secrets.DATABASE_URI }}
      SECRET_KEY: ${{ secrets.SECRET_KEY }}

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.8'  # Quoted so YAML does not read versions like 3.10 as the number 3.1

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run tests
        run: |
          python -m unittest discover

      - name: Deploy to production
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: |
          python deploy_script.py  # Your deployment script or command here
        env:
          DATABASE_URI: ${{ secrets.DATABASE_URI }}
          SECRET_KEY: ${{ secrets.SECRET_KEY }}

Safeguarding Other Internal Information

It’s not just keys and secrets that need to be kept private. Don’t put the URLs or IP addresses of internal development sites, servers, or databases in public places, like code repos or this handbook. Ideally, those kinds of resources are only accessible from internal IP addresses, like the UCF network, which is why you often need to have a VPN running when working from home. But even then, best practice is to keep that information off of publicly-accessible websites and repos. Instead, post that kind of information (and search for it) in our private Teams folders or the secret server.

Principle of Least Privilege (Deny By Default)

You. Shall Not. Pass!

A common security failing is when users who are not logged in can access things that only valid users should, or valid users can access things that only admins should. The “Principle of Least Privilege” is that users should only ever be given the bare minimum level of permission (or access, or data) that they need to perform their job. Denying access by default is something that needs to be considered when creating API keys, assigning AWS users and roles, and setting up AWS Virtual Private Clouds (VPCs) and Security Groups. But this principle can be applied to every function and API call as well.

Most often when programming something, we are acting as the admin, and need to see all the data and all the features the software has available. Only later do we add some middleware or other form of permission checking to limit what data visitors or standard users can see. The problem there is that if we miss anything, data can be exposed. A better practice is to build everything from the ground up with permission absolutely denied, and then add code that checks whether the user is an admin or is logged in before providing the expected data.

Python BAD example

def get_item_data(request):
    # Communicate with the database to get admin data
    return admin_level_data

# Before going live, remember to add a layer of security...
# Hint: You will probably forget.

Python GOOD example

# From the very beginning...

def get_item_data(request):
    # Assume debug_mode_on and current_user are defined elsewhere in the code
    if debug_mode_on or current_user.get('is_admin', False):   # False is the default
        # Communicate with the database to get admin data
        return admin_level_data

    elif current_user.get('is_logged_in', False):   # False is the default
        # Communicate with the database to get user data
        return user_data

    else:
        return None
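
One way to keep the check from being forgotten is to attach it to the function itself. The following is a minimal sketch of that idea using a decorator. The names require_role and PermissionDenied, and the "roles" key on the user dictionary, are assumptions made up for this illustration rather than part of any framework.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the current user lacks the required role."""


def require_role(role):
    """Decorator: run the wrapped function only if the user has `role`."""
    def decorator(func):
        @wraps(func)
        def wrapper(current_user, *args, **kwargs):
            # Deny by default: a missing user or a missing "roles" key means no access.
            roles = (current_user or {}).get("roles", ())
            if role not in roles:
                raise PermissionDenied(f"{func.__name__} requires role {role!r}")
            return func(current_user, *args, **kwargs)
        return wrapper
    return decorator


@require_role("admin")
def get_admin_report(current_user):
    # Placeholder for a database call that returns admin-level data.
    return {"report": "admin-level data"}
```

With this shape, a new function is locked down from the moment it is written, and the required role is visible right above its definition.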

Don’t Send (or Even Store) Data Unless You Have To

Who else might be watching?

When you are sending data back to the end user, it is often quicker and easier to code by sending entire objects or libraries back. Then on the front-end the data is always the same and we can decide to show or hide things there. But ALL data that is sent to the front end can be seen by any user who can use basic browser tools, and it may also be cached or logged somewhere along the network. Every time data is sent, it should be considered field by field, and only the bare minimum should be returned.

Consider the following setup using Python:

from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///example.db'  # Remember, don't hard-code this
db = SQLAlchemy(app)

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True, nullable=False)
    email = db.Column(db.String(120), unique=True, nullable=False)
    password = db.Column(db.String(255), nullable=False)
    # Add more attributes as needed

For every user, you need to save password information, but even admins shouldn’t be sent that data. How should you set up a route that will follow the Principle of Least Privilege?

Python BAD Example

@app.route('/user/<int:user_id>')
def get_user_bad(user_id):
    user = User.query.get(user_id)
    if user:
        # Sending the entire user object to the front-end
        return jsonify(user.__dict__)
    else:
        return jsonify({'error': 'User not found'}), 404

Python GOOD Example

@app.route('/user/<int:user_id>')
def get_user_good(user_id):
    user = User.query.get(user_id)
    if user:
        # Returning only specific attributes based on permissions
        if user_is_authorized_to_view(user):
            return jsonify({
                'id': user.id,
                'username': user.username,
                'email': user.email,
                # Add more attributes as needed
            })
        else:
            return jsonify({'error': 'Unauthorized to view user data'}), 403
    else:
        return jsonify({'error': 'User not found'}), 404

def user_is_authorized_to_view(user):
    # Add logic to check the current user's permissions.
    # For example, check if the current user has administrative privileges.
    # In a real application, this might involve roles, permissions, or other access control mechanisms.
    # Assume current_user is defined elsewhere in the code, as in the earlier example.
    return current_user.get('is_admin', False)   # False is the default
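
The field-by-field idea can also be written as an explicit allow-list, so that a new column added to the model later is never sent by accident. This is a minimal sketch using plain dictionaries instead of the SQLAlchemy model; PUBLIC_FIELDS, ADMIN_FIELDS, and serialize_user are hypothetical names chosen for the example.

```python
# Hypothetical allow-lists: the fields each audience may receive.
PUBLIC_FIELDS = ("id", "username")
ADMIN_FIELDS = ("id", "username", "email")


def serialize_user(user_record, viewer_is_admin=False):
    """Copy only allow-listed fields from a user record into a new dict."""
    allowed = ADMIN_FIELDS if viewer_is_admin else PUBLIC_FIELDS
    return {field: user_record[field] for field in allowed if field in user_record}


record = {
    "id": 7,
    "username": "frodo",
    "email": "frodo@example.com",
    "password": "stored-hash-goes-here",
}
print(serialize_user(record))
print(serialize_user(record, viewer_is_admin=True))
```

Because the password field appears in neither list, it cannot leave the server through this function, no matter who is asking.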

Encrypt in Transit, Encrypt at Rest

It's Some Form of Elvish. I Can't Read It.

Sensitive data can be stolen in two ways. First, imagine someone coming across the username and password of a database administrator. They can now log in and get access to the database and read what is there. This would be reading the data “at rest”. Another hacker, who doesn’t have direct access to the data, may be able to use other techniques to intercept packets of data being sent between the end user and the server. This is reading data “in transit” or “in flight”.

In both cases, we ideally want the hacker to be unable to read or use the information they’ve acquired. When setting up databases, it is important to ensure that data is encrypted while at rest. When setting up routing and APIs, it is important to make sure that data is encrypted in transit.

Generally, most programmers don’t deal directly with encryption. In our case, most of this is set up when provisioning AWS resources. Still, it’s an important principle to be aware of, and you can read more deeply about it on the OWASP website. You can also avoid some of the pitfalls related to cryptography by doing the following:

Use HTTPS, Not HTTP

HTTPS encrypts traffic by default, whereas HTTP does not. That means that requests sent over HTTP travel as plain text and can easily be read when they are intercepted.

Don’t Make IDs Easy to Guess

When storing items in a database, make sure their IDs are randomly generated GUIDs, and NOT ordered by number (like an auto-incrementing index in SQL). A common practice when viewing documents is to have their IDs exposed in the URL, like ‘www.mystore.com/items/251’. You would never want a site user to be able to guess the next ID (in this case, 252), which could potentially allow them to see documents or data they shouldn’t be allowed to, as happened in the First American Financial Corporation data exposure disclosed in 2019. Hard-to-guess IDs are an extra layer of protection, not a substitute for checking permissions on every request.
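
As a minimal sketch of what randomly generated IDs can look like in Python, the standard library provides both random UUIDs and URL-safe random tokens. The function names new_item_id and new_share_token are hypothetical and exist only for this illustration.

```python
import secrets
import uuid


def new_item_id():
    """Return a random (version 4) UUID as a string."""
    return str(uuid.uuid4())


def new_share_token(num_bytes=32):
    """Return a URL-safe random token from the OS's secure random source."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    print(new_item_id())
    print(new_share_token())
```

A URL built from one of these values gives a visitor no way to work out what the neighboring item’s URL would be.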

POST Whenever Possible

There are a lot of different ways to submit or request information from a server. The quickest and easiest is using a GET request, which is how most public-facing websites work. You type in a URL, and it sends the data you need to render the website. This could include rather specific subsites or documents, all of which are retrieved based on how specific your URL is. For instance, https://www.ucf.edu and https://www.ucf.edu/degree/biomedical-sciences-bs/molecular-microbiology/ are both GET requests. All the information that the server needs to render the page is right there in the URL. It can be bookmarked, refreshed, navigated to with forward and back, and you will always get the same data back. This can include variables, too (those come after a ? in the URL), which sometimes you WANT saved and bookmarked, like:

  • https://www.example.com/home?lang=es (saves the user’s preferred language)
  • https://www.example.com/article-title?page=2 (saves page number)
  • https://www.example.com/search?term=gaming+pc&brand=alienware (saves the search term and brand filter)

Most online resources will tell you that the difference between a GET request and a POST request is that a GET request is designed to “get” or fetch information from a server in a way that doesn’t change the server’s state or data, while a POST request is a way to send new information to the server, like when submitting a form. This is true, but there is some important nuance that this definition misses.

If you think about getting traditional mail, a GET request is like a postcard. In addition to the address there on the side so it can be delivered, the whole message is right there on the card for anyone to read. Like in the data breach in the previous section about IDs, if part of your GET message URL contains important information, any user can see it and possibly use it in unintended ways. On the other hand, a POST request is more like a letter than a postcard. The addresses of the sender and receiver are out in the open on the envelope, but the message is safely tucked away inside, and can only be read once the envelope is opened by the receiver.

When making a data request from the front-end, it might be best to use POST requests instead of GET requests. POST requests allow you to send data to the server (like a user’s ID or the ID of the item being requested) without exposing those IDs in the URL.

GET BAD Example

// Front-end is in JavaScript

async function insecure_request(userID) {
  // The userID is exposed as part of the URL.
  const url = `http://site_URL/get_user_info_insecure?user_id=${userID}`

  // fetch() assumes the request is GET by default.
  const response = await fetch(url)
  const data = await response.json()
}

# Back-end is in Python

# This is assuming you're using Flask or something similar.
@app.route('/get_user_info_insecure', methods=['GET'])
def get_user_info_insecure():
  # This retrieves the exposed user_id parameter straight from the URL.
  user_id = request.args.get('user_id')
  # Do some processing to retrieve the user_info here.
  return jsonify({'user_info': user_info})

POST GOOD Example

// Front-end is in JavaScript

async function secure_request(userID) {
  // Nothing is exposed in the URL.
  const url = 'https://site_URL/get_user_info_secure'

  // fetch() can use its second parameter to format any request type.
  const response = await fetch(url, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
    },
    // The body is where all the private, secure data is sent.
    body: JSON.stringify({ user_id: userID }),
  })
  const data = await response.json()
}

# Back-end is in Python

# This is assuming you're using Flask or something similar.
@app.route('/get_user_info_secure', methods=['POST'])
def get_user_info_secure():
  # get_json() returns the body of the POST request. It "opens the envelope," so to speak.
  data = request.get_json()
  user_id = data.get('user_id')
  # Do some processing to retrieve the user_info here.
  return jsonify({'user_info': user_info})
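
To see the postcard and letter difference from Python, the sketch below builds (but never sends) one request of each kind with the standard library and shows where the ID ends up. The endpoint https://example.com/get_user_info is a made-up placeholder. Keep in mind that a POST body is only hidden from the URL; it still needs HTTPS to be unreadable in transit.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint used only for illustration; nothing is sent.
BASE_URL = "https://example.com/get_user_info"


def build_get_request(user_id):
    """The ID becomes part of the URL (and of logs, history, and bookmarks)."""
    return Request(f"{BASE_URL}?{urlencode({'user_id': user_id})}", method="GET")


def build_post_request(user_id):
    """The ID travels in the JSON body; the URL stays generic."""
    body = json.dumps({"user_id": user_id}).encode("utf-8")
    return Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    print(build_get_request(251).full_url)   # the ID is visible in the URL
    print(build_post_request(251).full_url)  # the URL carries no ID
```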

For a deeper dive, read a more detailed description of GET, POST, and other HTTP methods or the fetch() documentation.

Don’t Encrypt On Your Own

Cryptographic best practices change more frequently than you might expect. Even functions that are helpfully built into some languages, like the MD5 and SHA-1 hash functions (which are hashing, not encryption), are no longer recommended for security purposes. Before diving into whatever StackOverflow tells you, refer to the official OWASP Cryptographic Storage Cheat Sheet for guidance.
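
As one illustration of leaning on a vetted building block instead of a bare MD5 or SHA-1 call, here is a minimal sketch of salted password hashing with Python’s standard-library PBKDF2 function. The iteration count is an assumption for this sketch and should be checked against current OWASP guidance, which may also point to a different algorithm or a dedicated library. The function names are hypothetical.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; check current guidance before using.
ITERATIONS = 600_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, derived_key) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def verify_password(password, salt, expected_key, iterations=ITERATIONS):
    """Recompute the key and compare it in constant time."""
    _, key = hash_password(password, salt, iterations)
    return hmac.compare_digest(key, expected_key)
```

Only the salt and the derived key would be stored. The comparison uses hmac.compare_digest rather than == so that timing differences do not leak how much of the key matched.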

Validate Front-End, Validate Back-End

Fool of a Took!

Users need to interact with your app and the data behind it in a variety of ways. Any time we allow a user to enter values, and our software reads those values, we must check that the input is both valid and not malicious.

Front-End Validation

When you click the “submit” button at the bottom of a form, the browser sends a request to the server (a POST request when the form sets method="post"; the HTML default is GET), and more often than not, the page is refreshed or a new page is loaded, which will reset the form. Front-end validation is primarily used to protect the user from themselves. We can double-check a form and make sure that required fields are filled, that email addresses are properly formatted, or that numbers are within accepted ranges BEFORE sending the request. The standard way to do this is by using the HTML form’s onsubmit attribute.

JavaScript GOOD Example

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Form Validation Example</title>
    <style>
        .error {
            color: red;
        }
    </style>
</head>
<body>
    <h2>Registration Form</h2>
    <form id="registrationForm" method="post" onsubmit="return validateForm()">
        <label for="username">Username:</label>
        <input type="text" id="username" name="username">
        <span id="usernameError" class="error"></span>

        <br>

        <label for="password">Password:</label>
        <input type="password" id="password" name="password">
        <span id="passwordError" class="error"></span>

        <br>

        <input type="submit" value="Submit">
    </form>

    <script>
        function validateForm() {
            // Get form inputs
            var username = document.getElementById('username').value;
            var password = document.getElementById('password').value;

            // Reset error messages
            document.getElementById('usernameError').innerHTML = "";
            document.getElementById('passwordError').innerHTML = "";

            // Validate username
            if (username === "") {
                document.getElementById('usernameError').innerHTML = "Username is required";
                return false;
            }

            // Validate password
            if (password === "") {
                document.getElementById('passwordError').innerHTML = "Password is required";
                return false;
            }

            // Additional validation logic can be added here

            // If all validations pass, the form is considered valid
            return true;
        }
    </script>
</body>
</html>

Note that the validateForm() function above only checks against empty strings. You may want to use additional checks for validity (remove spaces, check string length, make sure the password meets minimum requirements, verify the username isn’t already taken, etc.).

Back-End Validation

Even with front-end validation, it is relatively simple to intercept network traffic and change values before they are sent to the server. When data is sent to the back-end, assume that it cannot be trusted. No data, like a string entered by a user, should ever be used directly in a database query.

Python BAD Example

import sqlite3

def insecure_query(user_input):
    # Insecure: Directly using user input in the SQL query
    query = f"SELECT * FROM users WHERE username = '{user_input}'"
    
    # Connect to the database and execute the query
    connection = sqlite3.connect('example.db')
    cursor = connection.cursor()
    cursor.execute(query)
    
    # Fetch the results
    results = cursor.fetchall()
    
    # Process the results (omitted for brevity)
    
    # Close the connection
    connection.close()

# Example usage
user_input = "malicious_user' OR '1'='1' --"
insecure_query(user_input)

In this example, the SQL query becomes SELECT * FROM users WHERE username = 'malicious_user' OR '1'='1' --'. The trailing -- turns the leftover closing quote into a comment, so the query still runs. This will select ALL the users from the database, since '1'='1' is always true, even when a user’s name isn’t 'malicious_user'. This kind of hack will return data on all of the users in the database, potentially exposing sensitive information. This code would also be vulnerable to little Bobby Tables.

Instead, pass user input to the database as a separate parameter, so the database driver treats it strictly as data and never as part of the SQL command. To fix the code above…

Python GOOD Example

import sqlite3

def secure_query(user_input):
    # Secure: Using parameterized queries to prevent SQL injection
    query = "SELECT * FROM users WHERE username = ?"
    
    # Connect to the database and execute the query
    connection = sqlite3.connect('example.db')
    cursor = connection.cursor()
    cursor.execute(query, (user_input,))
    
    # Fetch the results
    results = cursor.fetchall()
    
    # Process the results (omitted for brevity)
    
    # Close the connection
    connection.close()
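
Parameterized queries cover the database, but the back-end should also repeat the checks the front-end form made, since those can be bypassed. The following is a minimal sketch of server-side validation for the registration form shown earlier. The specific rules (allowed characters, lengths) and the name validate_registration are assumptions for illustration; real requirements will differ per project.

```python
import re

# Assumed rules for this sketch; real requirements will differ per project.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,30}")
MIN_PASSWORD_LENGTH = 12


def validate_registration(form):
    """Return a dict of field -> error message; an empty dict means valid."""
    errors = {}
    username = form.get("username")
    password = form.get("password")

    # fullmatch() requires the whole string to fit the pattern, not just a prefix.
    if not isinstance(username, str) or not USERNAME_PATTERN.fullmatch(username):
        errors["username"] = "Username must be 3-30 letters, digits, or underscores"
    if not isinstance(password, str) or len(password) < MIN_PASSWORD_LENGTH:
        errors["password"] = f"Password must be at least {MIN_PASSWORD_LENGTH} characters"
    return errors
```

A route handler could call this on the submitted values and return the errors with a 400 status whenever the dictionary is not empty, before any database code runs.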

Use Logging for Recording and Notifications

I Have No Memory of This Place.

Even though logging cannot prevent problems, it is an incredibly important tool in identifying problems after they occur, much like a security camera. It is easiest to add logging throughout the development process instead of tacking it on at the end, when you are likely to miss hard-to-find places or rare exceptions.

ABSOLUTELY Log:

  • Logins: Successful and unsuccessful.
  • Errors: Preferably with clear log messages to help with debugging.
  • Major Changes: In our case, these would be database calls with big consequences, like changing a user’s permissions, deleting a high-impact file, or changing admin-level settings.
  • Suspicious Activity: Think like a hacker. What sort of mischief could they hope to get into?

When using logs, be sure to:

  • Store logs remotely: If they are stored on the same server as your app and that server goes down, you can’t get to the logs, either.
  • Include a reporting tool: Send a message to Sentry or another tool to make sure people are notified of major errors.
  • Don’t name-drop: Strike a balance between providing useful information for debugging and protecting sensitive information.
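
As a minimal sketch of the first item on the “ABSOLUTELY Log” list, the function below records successful and failed logins with Python’s standard logging module while keeping passwords out of the message. The logger name app.security and the function log_login_attempt are hypothetical; shipping the records to remote storage or a tool like Sentry would be configured separately through logging handlers.

```python
import logging

logger = logging.getLogger("app.security")


def log_login_attempt(username, success, source_ip):
    """Record a login attempt without logging the password or any secret."""
    if success:
        logger.info("login succeeded user=%s ip=%s", username, source_ip)
    else:
        # Failed logins are logged at WARNING so they are easy to filter and alert on.
        logger.warning("login failed user=%s ip=%s", username, source_ip)


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    log_login_attempt("gandalf", True, "203.0.113.5")
    log_login_attempt("gollum", False, "203.0.113.9")
```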

Keep Things Up-to-Date

Really Old Documents

Really sophisticated hacks often rely on “Zero-Day Vulnerabilities,” which are flaws in code that hackers find and exploit before the developer is aware of them. Imagine a bunch of shouting and running around as programmers panic because they literally have zero days to fix the issue. This happens with even highly-vetted and highly-used software (think of all the times you’ve had to urgently update your browser). On the other hand, flaws are often found by people who are already working on the code, and they patch and update things before they can be exploited.

In both cases, the patch itself can tell hackers exactly where the issue was. Imagine that there is a popular open source library that has a new update, v2.15, which “Fixes a security issue that potentially allowed unauthenticated remote access.” Any programmer could look at the committed code for the new update and find exactly what was changed and deduce where that old vulnerability is and how to exploit it. But why would they do that if it was already patched and fixed?

Because if the hacker can find any software or site that is still using a pre-patch version below v2.15, they have an easy ticket in. For example, hackers accessed a U.S. federal government agency between June and July 2023 using a known Adobe ColdFusion vulnerability that had already been patched in March 2023.

Software that is not in active development needs to be maintained regularly, with all of its dependencies updated to the newest versions. This can absolutely cause bugs and issues that need to be fixed as part of the maintenance process, as dependencies become deprecated or are out of sync.

Specific Ways to Keep Up-to-Date

Put It On Your Calendar: Schedule when you will spend some time on any given project to update dependencies and retest that everything works. Make sure you give yourself enough time to do a slow and thorough update.

Use Your Package Manager to Help: If your project uses a package manager like npm or yarn, use the npm audit or yarn audit command to check for known vulnerabilities and npm outdated or yarn outdated to check for outdated packages.

Consider Version Ranges: In your package.json file, instead of requiring a single, specific version of a dependency, try something like "^1.2.3", which accepts any version from 1.2.3 up to, but not including, 2.0.0.
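
To make the caret rule concrete, here is a small Python sketch of how a range like "^1.2.3" decides which versions it accepts. It is a simplified model written for illustration, not npm’s own implementation: it handles only plain MAJOR.MINOR.PATCH versions and ignores pre-release tags. The function names are hypothetical.

```python
def parse_version(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def caret_upper_bound(base):
    """Return the first version that '^base' excludes.

    The caret allows changes that do not modify the left-most non-zero part.
    """
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def caret_allows(range_base, candidate):
    """Return True if `candidate` satisfies '^range_base'."""
    base = parse_version(range_base)
    version = parse_version(candidate)
    return base <= version < caret_upper_bound(base)
```

Under this model, "^1.2.3" picks up patch and minor releases such as 1.2.4 or 1.9.0 but stops before 2.0.0, while ranges starting with 0 are much narrower.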

Hand-Audit Dependency Use: See what dependencies are being used in the project and consider alternatives. Sometimes there are better alternatives available.

Document the Process: Include in your documentation the best way to maintain your specific software, with potential issues and quirks you’ve had to handle.