Recently, a friend told me about a simple question he had been asked in an interview. The company was a well-reputed household brand, and the compensation was among the best he had heard of in the industry. But the question wasn’t a common one. It was simply, “I am trying to create a file on Linux, but I can’t. What are the possible reasons this can happen?”
The question itself was unusual, but this type of question isn’t unusual at all. Open-ended questions like these are meant to reveal the candidate’s thought process and technical skill. I really liked this question because it isn’t bound to the specific experience or technologies the candidate has used.
For example, questions such as “How to do XYZ in NodeJS” or “How to configure NGINX” are, in my opinion, not great questions. Sure, they can tell the interviewer whether someone knows NodeJS or NGINX, but do you really want to hire people specifically for knowing NodeJS or NGINX? What if you decide to build a new service in Django or in a Golang web framework in the future? What if the company decides to move to a different proxy, like Envoy?
Questions like “I am trying to create a file on Linux, but I can’t. What are the possible reasons this can happen?” are good questions because they allow the candidate to show off their skills and experience in a wider domain. A DevOps/SRE engineer might start thinking about Linux specifically: the server, permissions, processes, or inodes. A backend engineer with cloud experience might think of reasons related to cloud infrastructure, such as misconfigured EBS or EFS volumes. A full-stack developer may assume that an API is writing the file programmatically and look at the error the code is returning. None of these are wrong answers, but they show the experience and the knowledge of the interviewee.
In this series, I want to take open-ended questions like these and discuss my answers in depth, going over the associated technologies and how it all fits together. These questions probably have a million answers each, so let me know in the comments what you think the possible answers could be as well.
The question I want to discuss in this post is,
I am trying to create a file on Linux, but I can’t. What are the possible reasons this can happen?
Before you read my answers, try and think about it a bit and see what possible solutions you can think of. I have also sorted my answers from the easiest to most difficult for easier reading and have a few links to better understand the topic in each answer as well.
There is no disk space left
This is probably one of the easiest answers possible. Obviously, to create a file, there should be enough disk space in your secondary storage to create it. You can think of simple SSDs for on-prem servers, or of storage as cloud resources (like EBS volumes) in a cloud environment.
A follow-up question could be how you would know how much storage is left. There are a lot of ways to figure this out. My approach would be to check whatever metrics/dashboards you already have. A lot of AWS services provide you with metrics for disk usage, or you can create your own CloudWatch metrics as well. There can be some configuration requirements, for example installing an agent, but apart from that, the process should be pretty simple.
You can also check out third-party solutions, such as Nagios. There is a lot to talk about in monitoring, and this post isn’t really meant for that, but know that there are a lot of ways you can monitor your disk usage, set up alerts, and configure all of this.
Finally, a simple

df -h /

is also a valid answer and is used, though generally in smaller applications or proofs of concept.
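As a quick sketch of how I’d check this from a shell (the mount point / is just an example):

```shell
# Human-readable usage for the root filesystem:
# size, used, available space, and usage percentage.
df -h /

# Script-friendly variant (GNU coreutils): print only the available
# space in 1K blocks, handy for alerting thresholds in health checks.
df --output=avail / | tail -n 1
```

The second form is easier to compare against a threshold in a cron job than parsing the full `df -h` table.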
Permission issues

Another simple issue that can prevent you from creating files is permissions. Maybe the process or the user simply does not have the right permissions to create files in that directory.

A possible counter-question could be how to check whether the user or the process has the right permissions. This is pretty dependent on how you have configured your servers. Running a simple

ls -l

should tell you the permissions of any file or directory.
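A minimal sketch of how this plays out, using a throwaway directory (the path and names are just examples):

```shell
# Create a scratch directory and remove the write bit,
# so new files can no longer be created inside it.
mkdir -p /tmp/perm-demo
chmod 555 /tmp/perm-demo   # r-xr-xr-x: no write permission

# Inspect the permission string; the first column shows dr-xr-xr-x.
ls -ld /tmp/perm-demo

# For a non-root user, this touch fails with "Permission denied".
# (root bypasses these checks, which is itself worth mentioning.)
touch /tmp/perm-demo/newfile 2>/dev/null || echo "cannot create file here"

# Clean up.
chmod 755 /tmp/perm-demo
rm -rf /tmp/perm-demo
```

Note that for creating a file, the write and execute bits on the *directory* matter, not the permissions of any existing file.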
High CPU/memory consumption
Every command you execute generally runs in its own process, and every process consumes some amount of resources: CPU and memory.
At the simplest level, any running process stores a bunch of metadata about itself, like open file descriptors, its PID, its state, etc. The OS also needs resources to actually run the process; in our example, the process is likely to open a file, write to that file, and close it. This requires a small amount of RAM and CPU, and without that, the process is unlikely to do its job properly.
So if your server’s RAM is full and/or running processes are utilizing 100% of your CPU, it is unlikely that you will be able to create a new file. In fact, the process trying to create the new file would also likely freeze. For example, if you are trying to create a new file by running

touch newfile

in an SSH shell, and CPU and memory consumption are high enough, it is unlikely you’d even be able to SSH into the server.
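A few quick checks I’d reach for to confirm this; reading /proc directly works even on minimal systems where procps tools aren’t installed:

```shell
# Memory: total and available, straight from the kernel.
grep -E '^(MemTotal|MemAvailable)' /proc/meminfo

# Load average over 1, 5, and 15 minutes; compare it against the CPU count.
cat /proc/loadavg
nproc

# The same information in friendlier form, where procps is installed:
#   free -h          (memory summary)
#   top -b -n 1      (busiest processes, batch mode, one iteration)
```

A 1-minute load average well above the `nproc` count is a quick sign that the CPU is saturated.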
Partitioning issues

Partitioning issues can also prevent you from creating new files. Maybe you have configured your EBS volume to have multiple partitions, or maybe you decided to increase the size of your EBS volume but haven’t grown the partition and filesystem on the OS side yet, or there is some other misconfiguration with your volumes.
Partitioning in general is a big topic in itself, so modifications there might lead to situations where you won’t be able to create a new file on your volume.
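As an illustration, this is roughly the sequence for the “grew the EBS volume but the filesystem didn’t follow” case. The device names are assumptions, growpart comes from the cloud-guest-utils package, and resize2fs applies to ext4 (XFS uses xfs_growfs instead), so the growing steps are left commented out here:

```shell
# 1. Confirm the block device grew but the partition/filesystem did not.
lsblk 2>/dev/null || cat /proc/partitions

# 2. Grow partition 1 of the disk to fill the new space (example device):
#      growpart /dev/nvme0n1 1

# 3. Grow the ext4 filesystem to fill the enlarged partition:
#      resize2fs /dev/nvme0n1p1

# 4. Verify the filesystem now reports the new size.
df -h /
```

Until steps 2 and 3 are done, `df` keeps reporting the old size, and writes fail with “No space left on device” even though the underlying volume is bigger.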
Network issues when using network file systems
You may be using a network file system based on a protocol like NFS or SMB. The way this works is that your files are technically stored on another server, and a different server can read and write those files over a network protocol.
A popular example of this is AWS’s EFS (Elastic File System). This service allows you to create a network file system and then mount it as a volume in your EC2 instance. The official doc does a good job explaining what EFS is if you want to know more about it. This walkthrough of mounting an EFS filesystem on your EC2 instance is also a good resource.
Since your filesystem now depends on the network, anything that can cause a disconnection between the servers (for example, firewall, security group, NACL, or subnet issues) can cause problems when writing or reading files as well.
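A rough sketch of the checks I’d run from the client. The EFS hostname below is a made-up placeholder, and NFSv4 (which EFS uses) listens on TCP port 2049:

```shell
# Is the share actually mounted, and with what options?
grep -E 'nfs' /proc/mounts || echo "no NFS mounts found"

# Can we even reach the NFS port on the file server?
# (fs-12345678.efs.us-east-1.amazonaws.com is a hypothetical example.)
#   nc -zv fs-12345678.efs.us-east-1.amazonaws.com 2049
```

If the mount is present but the port is unreachable, the usual suspects in AWS are the security group on the mount target and the subnet/NACL configuration.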
Inode exhaustion

Inodes store metadata about a file or a directory in your system, such as where the file’s data is located, the size of the file, the number of links to the file, etc. Every regular file has an inode as well.
Your OS has a limit on the number of inodes that can be created on each filesystem. Running

df -i

shows you the maximum number of inodes that can be created, the number of inodes currently used, and the number of inodes free.
For example, on my personal laptop, I get the following output for my root filesystem:

Filesystem       Inodes   IUsed    IFree IUse% Mounted on
/dev/nvme0n1p2 31227904 2439357 28788547    8% /
Needless to say, creating 31,227,904 files is not common, but inode exhaustion can happen, for example when an application generates millions of tiny files (caches, sessions, queues). Moreover, if your inodes are exhausted, the error message will generally say that your disk is full (“No space left on device”), even though it may have plenty of space left.
Another interesting point to note is that some filesystems, like ZFS, don’t have a fixed limit on the number of inodes that can be allocated in the filesystem; they are allocated dynamically as needed.
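To see this on a live system, I’d check inode usage and then hunt for directories with huge file counts; the paths below are just examples, and the find can be slow on big trees:

```shell
# Inode usage for the root filesystem: total, used, free, use%.
df -i /

# Count entries in one directory; sessions, caches, and queues
# are common culprits for many-tiny-files problems.
ls -1 /tmp | wc -l

# A (potentially slow) way to find the biggest offenders under a path:
#   find /var -xdev -type d -exec sh -c 'echo "$(ls -1 "$1" | wc -l) $1"' _ {} \; | sort -rn | head
```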
PID exhaustion

Running a command in bash generally involves two steps:

fork to create a new duplicate process
exec to replace the new process with the program you want to run
Bash does this without any intervention from you, but you should be aware that every command you execute creates a new process.
If you are creating a file using something like

nano

then bash is likely creating a process for you, running that command in the process, and terminating the process once the command finishes. Since every new process needs a PID, spawning this process that creates a file also requires a PID.
PID exhaustion is theoretically possible, with the file

/proc/sys/kernel/pid_max

defining the maximum number of PIDs possible. On my laptop, for example, this value is 4,194,304. As this answer explains, there is no requirement for PIDs to keep incrementing; the only requirement is that the PIDs of currently running processes are unique. So to exhaust the possible PIDs, you’d need 4 million+ processes running simultaneously, which doesn’t seem like a common scenario.
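Checking how close you are to this limit is straightforward; counting /proc entries avoids depending on ps being installed:

```shell
# Upper bound on PIDs the kernel will hand out.
cat /proc/sys/kernel/pid_max

# Number of processes currently running: one numeric directory per PID.
ls /proc | grep -c '^[0-9]'
```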
There are other related limits to consider as well, for example, the limits on the number of file descriptors that can be open at any point in time (I wrote an article on file descriptors if you want to read about them a bit more). Depending on the server configuration and the specific Linux distribution you are using, there can be other forms of limits as well. This is a good resource if you want to read more about it.
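The file descriptor limits are also easy to inspect; the exact numbers vary by distribution and configuration:

```shell
# Per-process soft limit on open file descriptors for this shell.
ulimit -n

# System-wide cap on open file handles.
cat /proc/sys/fs/file-max

# Current handle usage: allocated, unused-but-allocated, and the maximum.
cat /proc/sys/fs/file-nr
```

Hitting either limit typically surfaces as “Too many open files” (EMFILE per process, ENFILE system-wide) rather than a disk-full error.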
Though this seems really unlikely to happen, it is still a valid answer and shows your ability to think outside the box.
Outages in managed services

Another simple reason could be an outage in a service you are using to manage your file system. For example, an outage in EBS can prevent you from creating files when using EBS volumes.
If you enjoyed this post, do check out a similar post I wrote recently about yet another very popular interview question: Interview Questions: How to improve the performance of your database
Thanks for taking the time to read this. If you have any feedback, do let me know in the comments.