Friday, June 1, 2018

Good Defaults for Technical Decisions

In my experience as a software engineer I've found a few "rules of thumb" for technical decisions. None of these are hard requirements or things that can never be false. However these are good guidelines if you don't have a reason to make a different decision. Unlike most engineering decisions which first present the constraints and then try and find a solution within them, this attempts to document decisions one should make if you didn't have any constraints in the first place.
Its possible you'll disagree with me on some of these and I'd like to understand why. That said, I'm not interested in specific projects where these are a bad idea but for an understanding about why these shouldn't be the default.
  1. Be explicit about your requirements. Don't automatically detect features, dependencies, or environment related issues. It is easier to change this later to make things more "magical" than go back and figure out what you need.
  2. Namespaces are good: even if you think you'll only ever need one. Its easier to modify in the future, versioning, etc.
  3. Errors ought to error. Warning ought to error or not exist. It is generally unhelpful to have noise in your output that you do nothing with. If a warning isn't meaningful, disable it.
  4. Keep scope local and private. Prefer hiding functions and information from the outside unless you have to decided to make this an API.
  5. Naming your first version as a "v1" and label it as such. During rewrites, migrations, or related issues prefer versions rather than names such as "next" or "new". There will likely be many "nexts" and only one v2.
  6. Structured is better than unstructured. Similar to the point about explicitness: it is easier to go from structured to unstructured than vice versa.
  7. Fixed is better than editable. Don't let people change things unless there is a reason to. This also applies to code (immutable is better than mutable).
  8. Don't rely on people not making mistakes. Even if you have perfect people, they might be tired, have something in their eye, misremember a fact, or otherwise be operating at a sub-optimal state.
  9. Name same things the same and different things differently. Use, and accept, the same formats for the same thing at all layers of the stack. As a counter example ruby outputs missing gems as name-version but gem(1) expects name:version.
This is a work in progress document and I'll try and update it over time

Saturday, February 17, 2018

Some rules for designing libc style APIs

  1. Identifiers should not have vowels; they are expensive and difficult to type.
  2. An identifier must not be longer than 8 characters. The only exception are functions intended for standardization like sched_ss_init_budget.
  3. Functions must not be reentrant. Relying on internal state means you can avoid allocating memory.
  4. Functions should take at least two parameters. The second parameter should be a "flags" parameter which causes the function to do entirely different things.
  5. Flags should be passed as macros with unspecified values. These macros must not have reasonable values.
  6. Error handling must be done in one of two ways. The choice must not be consistent with other functions in the library:
    1. The real return value should be stored in an "out" parameter. The return value must only determine if an error has occurred or not.
    2. If an error occurs, the return value must be undefined. The return value can't be safely used without checking for errors using a separate function (e.g., fgets).
  7. The error code should be in errno, requiring the setting of `errno = 0` beforehand and checking after an error occurs. However, the return value should be a value legally allowed to be in errno, so that initial attempts to use the function appear to work.
  8. If the function returns a string, it must do so by modifying a memory location given as a parameter. Whether or not the string is terminated with a null must be determined solely based on the length of the output, a user supplied parameter, and choice of compiler.
Thanks to jp, okdana for the inspiration and review; thanks to gonzo, arbrock for review.

Sunday, November 27, 2016

Papers We Read

Some months ago I started a reading group at my workplace focussed on distributed systems. The goal of the group was to be an informal meeting to discuss a mixture of high impact, historical, and modern papers.

This is the list of papers we read:
  1. Lamport Time Clocks
  2. Spanner: Google’s Globally-Distributed Database
  3. The Chubby Lock Service for Loosely-Coupled
  4. A note on distributed computing
  5. The Byzantine Generals Problem
  6. Your computer is already a distributed system. Why isn't your OS?
  7. How Complex Systems Fail
  8. Fast and Message-Efficient Global Snapshot Algorithms for Large-Scale Distributed Systems
  9. Automatic Management of Partitioned, Replicated Search Services
  10. Simple Testing Can Prevent Most Critical Failures
  11. Dynamo: Amazon’s Highly Available Key-value Store
  12. Wait-free coordination for Internet-scale system
  13. Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
  14. SEDA: An Architecture for Well-Conditioned, Scalable Internet Services
  15. Kafka: a Distributed Messaging System for Log Processing
  16. DistributedLog: A high performance replicated log service 
  17. The Log: What every software engineer should know about real-time data's unifying abstraction
  18. Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks
  19. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications 
  20. MapReduce: Simplified Data Processing on Large Clusters
  21. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
  22. MillWheel: Fault-Tolerant Stream Processing at Internet Scale
  23. Snowflake - Unique ID Generation. “No two snowflakes are alike.”
  24. The Hadoop Distributed File System
  25. Gorilla: A Fast, Scalable, In-Memory Time Series Database
  26. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial
  27. Meltdown
  28. Spectre Attacks: Exploiting Speculative Execution
  29. Communicating Sequential Processes
  30. The Tail at Scale
  31. Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web
  32. Dapper, A Large Scale Distributed Systems Tracing Infrastructure
  33. The many faces of consistency
  34. SDPaxos: Building efficient semi-decentralized geo-replicated state machines
  35. Dataflow Model
  36. How to read a paper
  37. Jupiter Rising: A Decade of Clos Topologies andCentralized Control in Google’s Datacenter Network
  38. Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook
  39. Harvest, Yield, and Scalable Tolerant Systems
  40. Lineage stash: fault tolerance off the critical path.
  41. F1 Query: Declarative Querying at Scale
  42. The Architectural Implications of Facebook’s DNN-based Personalized Recommendation
  43. EFLOPS: Algorithm and System Co-design for a High Performance Distributed Training Platform
Other papers mentioned but not discussed:
Updated 2018-05-31: added additional papers
Updated 2018-06-20: added additional papers. Linkified a few more
Updated 2018-10-06: added additional papers. Linkified a few more
Updated 2019-07-10: added additional papers. Linkified a few more
Updated 2022-04-05: added additional papers. Linkified a few more

Monday, June 15, 2015

Blogging My Way Through CLRS Section 4.1

After another long break of not writing up any CLRS answers here is section 4.1.
Question 4.1-1:

What does $\textit{Find-Maximum-Subarray}$ return when all elements of $A$ are negative?

The procedure would return the single element of maximum value. This is expected since the maximum subarray must contain at least one element. This can be computed by note that $\textit{Find-Max-Crossing-Subarray}$ will always return the array of solely the midpoint and that $\textit{Find-Maximum-Subarray}$ always finds the maxium of $\{leftsum, rightsum, and crosssum\}$

Question 4.1-2:

Write pseudocode for the brute-force method of solving the max-subarray problem. Your solution should run in $\theta(n^2)$ time.

max_i = nil
max_j = nil
max_sum = -∞

for i in 0..len(A):
   cur_sum = 0
   for j in i..len(A):
     cur_sum += A[j]
     if cur_sum > max_sum:
       max_sum = cur_sum
       max_i = i
       max_j = j
return (max_i, max_j, max_sum)

Question 4.1-3:

Implement both the brute-force and recursive algorithms for the maximum-subarray problem on your own computer. What problem size $n_0$ gives the crossover point at which the recursive algorithm beats the brute-force algorithm? Then, change the base case of the recursive algorithm to use the brute-force algorithm whenever the problem size is less than $n_0$. Does that change the crossover point?

This question asks a question that is specific to the implementation, and the computer on which it is run. I will therefore be skipping it in this writeup. It is worthwhile to note that it is almost guarenteed that changing he implementation to use the brute force method for values less than $n_0$ is very likely to change $n_0$.

Question 4.1-4:

Suppose we change the definition of the maximum-subarray problem to allow the result to be an empty subarray, where the sum of the values of an empty subarray is 0. How would you change any of the algorithms that do not allow empty subarrays to permit an empty subarray to be the result?

For the brute force algorithm it would be rather trivial to add a check, and if the return max_sum is > 0 return the empty array.

For the recursive divide and conquer algorithm is is sufficient to just change the $\textit{Find-Max-Crossing-Subarray}$ in a manner similar to the brute force method. If $\textit{Find-Max-Crossing-Subarray}$ return the correct value, then $\textit{Find-Maximum-Subarray}$ will do the correct thing.

Question 4.1-5:

Develop a nonrecursive linear-time algorithm for the maximum-subarray problem.[1]

If one knows a previous answer to the max-subarray problem for a given prefix of the array than any new element consists of only two cases: being part of the maximum subarray or not being part of the maximum subarray. It is easier to explain with pseudocode: max_start = 0
max_end = 0
max_sum = A[0]

max_with_j = A[0]
for j in 1..len(A):
  # If J is in a maximum-subarray, either j is going to being the maximum on its, or it will will add to the current max
  max_with_j = max(A[j], max_with_j + x)
  Determine if J is in a maximum-subarray
  if max_with_j >= max_sum:
    max_sum = max_with_j
    max_end = j
    #Set the starting value if j is the sole element of a new subarray
    if max_with_j == A[j]:
      max_start = j
return (max_start, max_end, cur_max)

  1. The question provides some hints as to the solution of the problem.

Sunday, March 29, 2015

FreeBSD SMB Client under OSX Host

I recently purchased a new Macbook Pro and wanted to get a FreeBSD Virtual Machine set up in order to continue doing development work on it. Unfortunately, FreeBSD as a guest does not support native folder sharing so I decided to try using a samba mounted.

I decided to set up my VM to have two network interfaces: a NATed interface for internet access and a host-only interface for access to SMB and ssh.

The NAT networking configuration looks like:

NetworkName:    FreeBSDNatNetwork
IPv6 Enabled:   Yes
IPv6 Prefix:
DHCP Enabled:   Yes
Enabled:        Yes
Port-forwarding (ipv4)
        SSH IPv4:tcp:[]:5022:[]:22
Port-forwarding (ipv6)
        FreeBSD ssh:tcp:[]:6022:[fd17:625c:f037:2:a00:27ff:fefc:9dab]:22
loopback mappings (ipv4)

The Host-Only networking configuration looks like:

Name:            vboxnet0
GUID:            786f6276-656e-4074-8000-0a0027000000
DHCP:            Disabled
IPV6NetworkMaskPrefixLength: 0
HardwareAddress: 0a:00:27:00:00:00
MediumType:      Ethernet
Status:          Up
VBoxNetworkName: HostInterfaceNetworking-vboxnet0
The FreeBSD configuration looks like this: The OSX sharing configuration looks like:

Unfortunately, when attempting to actually mount the SMB filesystem with: mount_smbfs -I //eax@ I get the error mount_smbfs: can't get server address: syserr = Operation timed out

I tried installing the package net/samba36 and found that I needed the --signing=off flag to let it work:

It seems based on this setup and research that FreeBSD can not natively mount an OSX samba share. It might be possible to use sysutils/fusefs-smbnetfs. Other people have recommended NFS or sshfs.

Sunday, November 3, 2013

Two Factor Authentication for SSH (with Google Authenticator)

Two factor authentication is a method of ensuring that a user has a physical device in addition to their password when logging in to some service. This works by using a time (or counter) based code which is generated by the device and checked by the host machine. Google provides a service which allows one to use their phone as the physical device using a simple app.

This service can be easily configured and greatly increases the security of your host.

Installing Dependencies

  1. There is only one: the Google-Authenticator software itself:
    # pkg install pam_google_authenticator
  2. On older FreeBSD intallations you may use:
    # pkg_add -r pam_google_authenticator
    On Debian derived systems use:
    # apt-get install libpam-google-authenticator

User configuration

Each user must run "google-authenticator" once prior to being able to login with ssh. This will be followed by a series of yes/no prompts which are fairly self-explanatory. Note that the alternate to time-based is to use a counter. It is easy to lose track of which number you are at so most people prefer time-based.
  1. $ google-authenticator
    Do you want authentication tokens to be time-based (y/n)
    Make sure to save the URL or secret key generated here as it will be required later.

Host Configuration

To enable use of Authenticator the host must be set up to use PAM which must be configured to prompt for Authenticator.
  1. Edit the file /etc/pam.d/sshd and add the following in the "auth" section prior to pam_unix:
    auth requisite
  2. Edit /etc/ssh/sshd_config and uncomment
    ChallengeResponseAuthentication yes

Reload ssh config

  1. Finally, the ssh server needs to reload its configuration:
    # service sshd reload

Configure the device

  1. Follow the instructions provided by Google to install the authentication app and setup the phone.

That is it. Try logging into your machine from a remote machine now

Thanks bcallah for proof-reading this post.