Writing a Simple Filesystem Using FUSE and Java 17

Introduction

The goal is to explore the Foreign Linker functionality of Project Panama by building a simple in-memory filesystem with Java 17 and FUSE. Along the way we will look at how to do upcalls and downcalls and how to work with memory addresses.

What will the file system be able to do?

It will do the basics you would expect of a filesystem: we can mount it, create, read, and write files, create directories, and unmount it. The focus is on the Foreign Linker functionality, so to keep the implementation easy to understand we will not implement subdirectories. If you want to add that functionality, you would only have to create a Java class that keeps track of where files and directories are created.
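
If you do want subdirectories, the bookkeeping class mentioned above could look roughly like the sketch below. DirectoryTree and its methods are hypothetical names that are not part of the code in this article. The sketch stores full paths and only accepts a new directory when its parent already exists.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: tracks directories by full path so that nesting works.
final class DirectoryTree {
    private final Set<String> directories = new TreeSet<>();

    DirectoryTree() {
        directories.add("/"); // the mount point itself always exists
    }

    // Returns the parent of an absolute path, e.g. "/a/b" -> "/a" and "/a" -> "/".
    static String parentOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash <= 0 ? "/" : path.substring(0, lastSlash);
    }

    // Registers a directory; fails when the parent is unknown or the path already exists.
    boolean mkdir(String path) {
        if (!directories.contains(parentOf(path))) {
            return false;
        }
        return directories.add(path);
    }

    boolean isDir(String path) {
        return directories.contains(path);
    }
}
```

With such a class, the mkdir callback would pass the full path it receives instead of only the name.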

What are FUSE and Project Panama?

FUSE (Filesystem in Userspace) lets you create your own filesystem in userspace by implementing its interface. The FUSE project consists of two components: the FUSE kernel module and the libfuse userspace library. Our implementation will use the high-level API from libfuse. It provides functions to mount the file system, unmount it, read requests from the kernel, and send responses back.

Project Panama is a collection of improvements for the Java language. The goal of the project is to enrich and improve the connection between Java and native (foreign) interfaces, which are usually used by applications written in C.

Panama consists of the following JEPs (JDK Enhancement Proposals):

Foreign-Memory Access API JEPs: JEP-370, JEP-383
Foreign Linker API JEP: JEP-389
Vector API JEP: JEP-338

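We will focus on the Foreign Linker API because it offers pure Java access to native code. In Java 17 the memory access and linker APIs ship together as the incubating Foreign Function & Memory API (JEP 412) in the jdk.incubator.foreign module. Another benefit of the Foreign Linker is that its performance should be comparable to or better than JNI.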

Setup

Before you start, make sure you have FUSE installed on your Linux or Mac system. (If you are using Windows, you can follow along with WSL1, WSL2, or any Linux VM.) I used libfuse 3.10.5 for the examples; there might be slight differences if you are using an older or newer version. Running ldconfig -p | grep libfuse in a terminal shows which version of libfuse is installed. You will not get any output if libfuse is not installed.

We also need Jextract, a tool that generates Java files from C header files. It is only available in the Panama early-access build. Go to https://jdk.java.net/panama/, download the latest version for your system, and unzip it. We only need this specific build of Java to generate the Java files; the project we are going to build can use any Java 17 GA release.

Also download and unzip the source code of the libfuse version that is installed on your system from https://github.com/libfuse/libfuse/releases. We will use it as input for Jextract.

Setup to compile and run the application

Because the foreign linker is still incubating, we have to add some arguments when compiling and running the code. If you are using IntelliJ, add --add-modules jdk.incubator.foreign to the Java compiler options inside the settings, and add --enable-native-access=ALL-UNNAMED --add-modules jdk.incubator.foreign to the VM options inside the run configuration.

Let's start!

First, we will generate the Java files from the Libfuse source. We need to set up our Java version first to do that. Run this inside a terminal:

export JAVA_HOME={JAVA_DOWNLOAD_LOCATION}/jdk-17
export PATH=$JAVA_HOME/bin:$PATH

To test that jextract is working, run:

jextract -h

This should show you all the available command-line options:

-C <String> - specify arguments to be passed to the underlying Clang parser
-I <String> - specify include files path
-l <String> - specify a library (name or full absolute path) which should be linked when the generated API is loaded
-d <String> - specify where to place generated files
-t <String> - specify the target package for the generated classes
--include-function - name of function to include
--include-macro - name of constant macro to include
--include-struct - name of struct definition to include
--include-typedef - name of type definition to include
--include-union - name of union definition to include
--include-var - name of global variable to include
--source - generate java sources instead of classfiles

Creating the Java classes with Jextract

When everything is set up, we can create the Java files from the FUSE source. At the time of writing, I could not find a way to let Jextract include the FUSE_USE_VERSION macro. To work around this, I added #define FUSE_USE_VERSION 35 to the top of fuse.h in the libfuse-fuse-3.10.5/include/ directory.

Once that is done, fill in “LIBFUSE_SOURCE_DOWNLOAD_LOCATION” and run the command to generate the Java files.

jextract -C "-D_FILE_OFFSET_BITS=64" --source -d generated/src -t org.linux -I {LIBFUSE_SOURCE_DOWNLOAD_LOCATION}/libfuse-fuse-3.10.5/include/ {LIBFUSE_SOURCE_DOWNLOAD_LOCATION}/libfuse-fuse-3.10.5/include/fuse.h

-C "-D_FILE_OFFSET_BITS=64" passes an argument to the C language parser (Clang)
--source -d generated/src generates Java source files instead of class files and writes them to generated/src
-t org.linux the target package of the generated Java classes
-I {LIBFUSE_SOURCE_DOWNLOAD_LOCATION}/libfuse-fuse-3.10.5/include/ the include path
{LIBFUSE_SOURCE_DOWNLOAD_LOCATION}/libfuse-fuse-3.10.5/include/fuse.h the header file we want to use

Implementing FUSE

We now have all the building blocks, so let's get started! As we saw earlier, FUSE is just an interface we need to implement.

It is not like implementing a Java interface. It is done by calling a C function and passing it a structure (something like a Java record) that holds pointers to Java methods. You can find the structure inside the fuse.h file; it looks like this:

struct fuse_operations {
   int (*getattr) (const char *, struct stat *, struct fuse_file_info *fi);
   int (*readlink) (const char *, char *, size_t);
   int (*mknod) (const char *, mode_t, dev_t);
   int (*mkdir) (const char *, mode_t);
   int (*unlink) (const char *);
   int (*rmdir) (const char *);
   int (*symlink) (const char *, const char *);
   int (*rename) (const char *, const char *, unsigned int flags);
   int (*link) (const char *, const char *);
   int (*chmod) (const char *, mode_t, struct fuse_file_info *fi);
   int (*chown) (const char *, uid_t, gid_t, struct fuse_file_info *fi);
   int (*truncate) (const char *, off_t, struct fuse_file_info *fi);
   int (*open) (const char *, struct fuse_file_info *);
   int (*read) (const char *, char *, size_t, off_t, struct fuse_file_info *);
   int (*write) (const char *, const char *, size_t, off_t, struct fuse_file_info *);
   int (*statfs) (const char *, struct statvfs *);
   int (*flush) (const char *, struct fuse_file_info *);
   int (*release) (const char *, struct fuse_file_info *);
   int (*fsync) (const char *, int, struct fuse_file_info *);
   int (*setxattr) (const char *, const char *, const char *, size_t, int);
   int (*getxattr) (const char *, const char *, char *, size_t);
   int (*listxattr) (const char *, char *, size_t);
   int (*removexattr) (const char *, const char *);
   int (*opendir) (const char *, struct fuse_file_info *);
   int (*readdir) (const char *, void *, fuse_fill_dir_t, off_t,
         struct fuse_file_info *, enum fuse_readdir_flags);
   int (*releasedir) (const char *, struct fuse_file_info *);
   int (*fsyncdir) (const char *, int, struct fuse_file_info *);
   void *(*init) (struct fuse_conn_info *conn,struct fuse_config *cfg);
   void (*destroy) (void *private_data);
   int (*access) (const char *, int);
   int (*create) (const char *, mode_t, struct fuse_file_info *);
   int (*lock) (const char *, struct fuse_file_info *, int cmd,struct flock *);
   int (*utimens) (const char *, const struct timespec tv[2],
         struct fuse_file_info *fi);
   int (*bmap) (const char *, size_t blocksize, uint64_t *idx);

#if FUSE_USE_VERSION < 35
   int (*ioctl) (const char *, int cmd, void *arg,
            struct fuse_file_info *, unsigned int flags, void *data);
#else
   int (*ioctl) (const char *, unsigned int cmd, void *arg,
            struct fuse_file_info *, unsigned int flags, void *data);
#endif

   int (*poll) (const char *, struct fuse_file_info *,
           struct fuse_pollhandle *ph, unsigned *reventsp);
   int (*write_buf) (const char *, struct fuse_bufvec *buf, off_t off,
           struct fuse_file_info *);
   int (*read_buf) (const char *, struct fuse_bufvec **bufp,
          size_t size, off_t off, struct fuse_file_info *);
   int (*flock) (const char *, struct fuse_file_info *, int op);
   int (*fallocate) (const char *, int, off_t, off_t,
           struct fuse_file_info *);
   ssize_t (*copy_file_range) (const char *path_in,
                struct fuse_file_info *fi_in,
                off_t offset_in, const char *path_out,
                struct fuse_file_info *fi_out,
                off_t offset_out, size_t size, int flags);
   off_t (*lseek) (const char *, off_t off, int whence, struct fuse_file_info *);
};

Do not let the long list scare you. We are only implementing:

getattr Called when you read the attributes of a file
readdir Called when you read a directory
read Called when you read from a file
mkdir Called when you create a directory
mknod Called when you create a file
write Called when you write to a file

When something happens inside the filesystem, for example a directory is created, FUSE calls the method that mkdir points to. The same happens when you read a file: the method that read points to is called.
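
The idea of a structure full of function pointers can be mimicked in plain Java. The sketch below is only an analogy and does not use libfuse: Operations and Dispatcher are hypothetical names, and the dispatcher plays the role that FUSE plays in the real setup. The value 38 is ENOSYS on Linux, which is an assumption about the target platform.

```java
import java.util.function.ToIntFunction;

// Analogy only: a record whose fields are callbacks, like the function pointers in fuse_operations.
record Operations(ToIntFunction<String> mkdir, ToIntFunction<String> mknod) {}

final class Dispatcher {
    private final Operations operations;

    Dispatcher(Operations operations) {
        this.operations = operations;
    }

    // Plays the role of FUSE: picks the callback that matches the event and invokes it.
    int onEvent(String event, String path) {
        return switch (event) {
            case "mkdir" -> operations.mkdir().applyAsInt(path);
            case "mknod" -> operations.mknod().applyAsInt(path);
            default -> -38; // negated ENOSYS (Linux value): operation not implemented
        };
    }
}
```

The real difference is that FUSE is native code, so the callbacks have to be exposed as native function pointers, which is what the upcalls later in this article do.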

Helper methods

Some code is used in several places, so putting it in a few methods makes the rest of the code clearer.

At the class level, we add two lists and a map. We use the lists to keep track of the directories and files we have created. The map is used to retrieve the content of a file.

static List<String> directories = new ArrayList<>();
static List<String> files = new ArrayList<>();
static Map<String, String> filesContent = new HashMap<>();

These three small methods add a file and check whether a path is a known directory or a known file.

static boolean isDir(String path) {
    return directories.contains(path);
}

static void addFile(String filename) {
    files.add(filename);
    filesContent.put(filename,"");
}

static boolean isFile(String path) {
    return files.contains(path);
}

Creating a fuse_operations in Java

We start with this class:

import jdk.incubator.foreign.*;
import org.linux.*; // A

import java.util.Arrays;

public class SecondMain {

    static ResourceScope rsScope = null; 

    public static void main(String... args) {

        System.load("/usr/lib64/libfuse3.so.3.10.5");  // B

        args = new String[]{"-f", "-d", "/mnt/test/"};  // C

        try (var scope = ResourceScope.newSharedScope()) {  // D
            rsScope = scope;
            var arguments = Arrays.stream(args).map(s -> CLinker.toCString(s, scope)).toArray(MemorySegment[]::new); // E
            var allocator = SegmentAllocator.ofScope(scope);  // F
            var argumentCount = args.length;
            var argumentSpace = allocator.allocateArray(CLinker.C_POINTER, arguments); // G  

            MemorySegment operationsMemorySegment = fuse_operations.allocate(scope);  // H
        }

    }
}

At line “A” we import all the classes that we generated in the previous step. At line “B” we load the libfuse library that we want to use. You can use ldconfig -p | grep libfuse to find where it is located on your system.

At “C” we create an array with the parameters we want to pass to FUSE. -f keeps FUSE in the foreground, so we can see any output in the console. -d makes FUSE also print debug information to the console. /mnt/test/ is the mount point.

A resource scope manages the lifecycle of one or more resources, such as memory segments. We create a shared scope at “D” because FUSE runs multithreaded by default. You can use -s to make it run single-threaded if you want.
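
Because the callbacks may run on several FUSE threads at once, the plain ArrayList and HashMap used by the helper methods are not safe in the default multithreaded mode. Below is a minimal sketch of a thread-safe alternative. FileStore is a hypothetical class that is not part of the code in this article; passing -s, as mentioned above, is the simpler way out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical replacement for the static lists and map, safe for concurrent callbacks.
final class FileStore {
    private final Set<String> directories = ConcurrentHashMap.newKeySet();
    private final ConcurrentMap<String, String> filesContent = new ConcurrentHashMap<>();

    boolean isDir(String name) { return directories.contains(name); }

    boolean isFile(String name) { return filesContent.containsKey(name); }

    void addDir(String name) { directories.add(name); }

    // putIfAbsent keeps existing content if two threads create the same file.
    void addFile(String name) { filesContent.putIfAbsent(name, ""); }

    void write(String name, String content) { filesContent.put(name, content); }

    String read(String name) { return filesContent.get(name); }
}
```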

At line “E” we take each Java String and transform it into a C string that the C function can use. The result is an array of MemorySegment. Then, on line “F”, we create a SegmentAllocator that we use at line “G” to allocate memory for our array of MemorySegment.

Line “H” shows how we allocate memory for the FUSE operations. fuse_operations is the name of the class that Jextract generated for us. It has an allocate method that allocates memory for the structure inside the shared scope.

Implementing getAttr

This is the signature as we know it from fuse_operations.

int (*getattr) (const char *, struct stat *, struct fuse_file_info *fi);

This is the signature from the Jextract generated Java classes. It is part of a functional interface, so we can provide an implementation.

int apply(MemoryAddress x0, MemoryAddress x1, MemoryAddress x2);

For our implementation, we leave fuse_file_info fi out of the signature because we will not use it.

public static int getAttr(MemoryAddress path, MemoryAddress mStat) {
    String jPath = CLinker.toJavaString(path);                      // A
    MemorySegment statMemorySegment = stat.ofAddress(mStat, rsScope);   // B

    int S_IFDIR = 0040000; /* directory */
    int S_IFREG = 0100000; /* regular */

    // setting the stat atim (last access time)
    Instant now = Instant.now();
    timespec.tv_sec$set(stat.st_atim$slice(statMemorySegment), now.getEpochSecond()); // C
    timespec.tv_nsec$set(stat.st_atim$slice(statMemorySegment), now.getNano());

    // setting the stat mtim (last modify time)
    now = Instant.now();
    timespec.tv_sec$set(stat.st_mtim$slice(statMemorySegment), now.getEpochSecond());
    timespec.tv_nsec$set(stat.st_mtim$slice(statMemorySegment), now.getNano());

    stat.st_uid$set(statMemorySegment, 1000); // D
    stat.st_gid$set(statMemorySegment,1000);

    if ("/".equals(jPath) || isDir(jPath.substring(1))) {
        stat.st_mode$set(statMemorySegment, S_IFDIR | 0755); // E
        stat.st_nlink$set(statMemorySegment, 2);             // F
    } else if (isFile(jPath.substring(1))) {
        stat.st_mode$set(statMemorySegment, S_IFREG | 0644);
        stat.st_nlink$set(statMemorySegment, 1);
        stat.st_size$set(statMemorySegment, filesContent.get(jPath.substring(1)).getBytes().length); // G
    } else {
        return -2;          // H: negated ENOENT, no such file or directory
    }

    return 0; // I
}

On line “A” we convert a C string to a Java String. We know from the FUSE signature that the first parameter is the path of the file we want the attributes of. The second MemoryAddress in the parameters is the stat structure that we need to fill with the attributes of the requested file or directory. To access the stat structure, we need its MemorySegment, which we get at line “B”.
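
Conceptually, a C string is a sequence of bytes that ends with a 0 byte. The plain-Java sketch below mimics the round trip on a byte array, so it can be run without native access. CStrings is a hypothetical name and UTF-8 is an assumed encoding; the code illustrates the idea and is not how CLinker is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CStrings {
    // Encodes the text as UTF-8 and appends the terminating 0 byte.
    static byte[] toCString(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(encoded, encoded.length + 1); // the extra element is 0
    }

    // Decodes the bytes up to (not including) the first 0 byte.
    static String toJavaString(byte[] cString) {
        int length = 0;
        while (length < cString.length && cString[length] != 0) {
            length++;
        }
        return new String(cString, 0, length, StandardCharsets.UTF_8);
    }
}
```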

At line “C” we set the last access time. To set the time, we call timespec.tv_sec$set and set the seconds on a specific part of the stat memory segment, which we obtain by calling stat.st_atim$slice(statMemorySegment).

The user id and group id are set at line “D”. To keep it easy, we just set them to 1000 for now. In C you would call getuid() and getgid() to get the real values.

Inside the if at line “E” we set st_mode, which specifies whether the entry is a directory or a regular file, together with the permission bits. We also do this for files. One difference is that we set st_nlink to two for directories. You can read why this is done at https://unix.stackexchange.com/questions/101515/why-does-a-new-directory-have-a-hard-link-count-of-2-before-anything-is-added-to/101536#101536.
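
To make the layout of st_mode more concrete, here is a small sketch that combines and splits the file type and permission bits. FileMode is a hypothetical helper, and the constants are the Linux values (S_IFDIR and S_IFREG match the ones used in getAttr; S_IFMT is the standard mask for the type bits).

```java
final class FileMode {
    static final int S_IFMT = 0170000;  // mask for the file type bits
    static final int S_IFDIR = 0040000; // directory
    static final int S_IFREG = 0100000; // regular file

    static int directory(int permissions) { return S_IFDIR | permissions; }

    static int regularFile(int permissions) { return S_IFREG | permissions; }

    static boolean isDirectory(int mode) { return (mode & S_IFMT) == S_IFDIR; }

    static int permissions(int mode) { return mode & 07777; }

    public static void main(String[] args) {
        // Octal output makes the two parts easy to see.
        System.out.println(Integer.toOctalString(directory(0755)));   // 40755
        System.out.println(Integer.toOctalString(regularFile(0644))); // 100644
    }
}
```

Note that S_IFREG | 0644 is 33188 in decimal, which does not fit in a Java short, so the mode should be handled as an int.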

For files, we also need to set the size. Here we just convert the String to a byte array and use its length.
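
The distinction between characters and bytes matters here, because st_size is a byte count. A small sketch, assuming UTF-8 as the encoding (the same one doWrite uses when it decodes the buffer); FileSize is a hypothetical helper name:

```java
import java.nio.charset.StandardCharsets;

final class FileSize {
    // Size in bytes of the content when it is encoded as UTF-8.
    static long sizeOf(String content) {
        return content.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For pure ASCII content the two counts agree, but a String such as "h\u00e9llo" has five characters and six UTF-8 bytes. Passing the charset explicitly also avoids depending on the platform default charset, which getBytes() without arguments uses on Java 17.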

At line “H” we return -2, the negated value of ENOENT (2) in C, which means that there is no such file or directory: a component of the specified pathname did not exist, or the pathname was an empty string.
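
Magic numbers like -2 are easy to get wrong, so it can help to name them. A minimal sketch, assuming the Linux errno values; Errno and fail are hypothetical names that do not appear in the generated code:

```java
final class Errno {
    static final int ENOENT = 2;  // no such file or directory (Linux value)
    static final int EEXIST = 17; // file exists (Linux value)

    // FUSE callbacks report failure by returning the negated errno value.
    static int fail(int errno) {
        return -errno;
    }
}
```

In getAttr the last branch could then read return Errno.fail(Errno.ENOENT); instead of return -2;.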

We return 0 at the end to let FUSE know that we are done and that everything went fine.

Why do we need jPath.substring(1)?

FUSE passes us a path beginning with a /. When we later create the implementations for mkdir and mknod, we will only store the name and not the path, so we do not need the /.

Implementing readDir

The next method we implement is readDir. It is called when you want to know which files and directories are available inside the given directory.

The FUSE signature looks like this:

int (*readdir) (const char *, void *, fuse_fill_dir_t, off_t, struct fuse_file_info *, enum fuse_readdir_flags);

Jextract created this one. Just as with getattr, it is part of a functional interface, and we have to provide the implementation.

int apply(MemoryAddress x0, MemoryAddress x1, MemoryAddress x2, long x3, MemoryAddress x4, int x5);

Our implementation takes five parameters, and we only use the first three. filler is a little special: it is a helper function provided by FUSE that we call to add entries to the buffer.

public static int readDir(MemoryAddress path, MemoryAddress buffer, MemoryAddress filler, long offset, MemoryAddress fileInfo) {

    String jPath = CLinker.toJavaString(path);
    fuse_fill_dir_t fuse_fill_dir_t = org.linux.fuse_fill_dir_t.ofAddress(filler);  // A
    fuse_fill_dir_t.apply(buffer, CLinker.toCString(".", rsScope).address(), MemoryAddress.NULL, 0, 0); // B
    fuse_fill_dir_t.apply(buffer, CLinker.toCString("..", rsScope).address(), MemoryAddress.NULL, 0, 0);
    if ("/".equals(jPath)) {  // C
        for (String p : directories) {
            fuse_fill_dir_t.apply(buffer, CLinker.toCString(p, rsScope).address(), MemoryAddress.NULL, 0, 0);
        }

        for (String p : files) {
            fuse_fill_dir_t.apply(buffer, CLinker.toCString(p, rsScope).address(), MemoryAddress.NULL, 0, 0);
        }
    }

    return 0;
}

To invoke filler we need an instance of its functional interface; we obtain it at line “A”. As we talked about earlier, directories have two links in Unix-based file systems. At “B” we add the matching . and .. entries to every directory listing.

At line “C” we loop over the two lists and add the created directories and files using filler.

Implementing read

This method is called when we want to read the content of a file. The Fuse signature is as follows:

int (*read) (const char *, char *, size_t, off_t, struct fuse_file_info *);

Jextract created this as part of a functional interface for us:

int apply(MemoryAddress x0, MemoryAddress x1, long x2, long x3, MemoryAddress x4);

The method is passed a buffer that we need to fill with the content of the requested file.

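public static int read(MemoryAddress path, MemoryAddress buffer, long size, long offset, MemoryAddress fileInfo) {
    String jPath = CLinker.toJavaString(path).substring(1);

    if (!isFile(jPath)) {
        return -2; // negated ENOENT: no such file
    }

    byte[] selected = filesContent.get(jPath).getBytes();

    if (offset >= selected.length) {
        return 0; // nothing left to read at or past the end of the file
    }

    ByteBuffer byteBuffer = buffer.asSegment(size, rsScope).asByteBuffer(); // A

    // Copy at most size bytes starting at offset, without running past the end of the content.
    int from = Math.toIntExact(offset);
    int to = Math.toIntExact(Math.min(selected.length, offset + size));
    byte[] src = Arrays.copyOfRange(selected, from, to); // B
    byteBuffer.put(src); // C

    return src.length; // D
}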

In the first part of the method, we convert the C string to a Java String and check that we know the file. At line “A” we turn the buffer into a ByteBuffer by first getting its memory segment. Next, on line “B”, we copy the part that is requested by the user. Finally, we fill the ByteBuffer with the copied range and return its length, so FUSE knows how many bytes were read.
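
The range that a read request asks for can extend past the end of the file, because FUSE typically requests a whole buffer at a time. The sketch below isolates that range calculation in plain Java so it can be tested without mounting anything. ReadRange and slice are hypothetical names.

```java
import java.util.Arrays;

final class ReadRange {
    // Returns the bytes in [offset, offset + size), clamped to the end of the content.
    static byte[] slice(byte[] content, long offset, long size) {
        if (offset < 0 || offset >= content.length || size <= 0) {
            return new byte[0]; // nothing to read at or past the end
        }
        int from = (int) offset;
        int length = (int) Math.min(content.length - offset, size);
        return Arrays.copyOfRange(content, from, from + length);
    }
}
```

Keep in mind that the second and third arguments of Arrays.copyOfRange are start and end indexes, not an offset and a length.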

Implementing doMkdir

This method is called when we create a directory.

This is the FUSE signature.

int (*mkdir) (const char *, mode_t);

Jextract created this as part of a functional interface for us:

int apply(MemoryAddress x0, int x1);

When FUSE calls this method, we only convert the C String to Java and add it to the list of directories.

static int doMkdir(MemoryAddress path, int mode) {
    String jPath = CLinker.toJavaString(path);
    directories.add(jPath.substring(1));
    return 0;
}

Implementing doMknod

This is the FUSE signature that we are going to implement.

int (*mknod) (const char *, mode_t, dev_t);

Jextract generated method:

int apply(MemoryAddress x0, int x1, long x2);

When FUSE calls this method, we call the helper method addFile to add the file to the list of files and to create a key-value pair in the file content map.

static int doMknod(MemoryAddress path, int mode, long rdev) {
    String jPath = CLinker.toJavaString(path);
    addFile(jPath.substring(1));
    return 0;
}

Implementing doWrite

The Fuse signature:

int (*write) (const char *, const char *, size_t, off_t, struct fuse_file_info *);

Jextract generated method:

int apply(MemoryAddress x0, MemoryAddress x1, long x2, long x3, MemoryAddress x4);

doWrite receives a buffer parameter containing the bytes we need to save in memory.

static int doWrite(MemoryAddress path, MemoryAddress buffer, long size, long offset, MemoryAddress info) {
    byte[] array = buffer.asSegment(size, rsScope).toByteArray(); // A
    String jPath = CLinker.toJavaString(path).substring(1);
    filesContent.put(jPath, new String(array, java.nio.charset.StandardCharsets.UTF_8));
    return Math.toIntExact(size);
}

At line “A” we create a segment from the memory address of the buffer and the size. From that segment we get a byte array that we convert into a String and store inside the file content map. Note that this simple version ignores offset and replaces the whole content on every call.
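
If you want writes to honor the offset, for example when appending to a file, the content has to be patched instead of replaced. Below is a minimal sketch of that step in plain Java. It assumes the content is kept as a byte array rather than a String, which differs from the map used in this article, and WriteAt is a hypothetical name.

```java
import java.util.Arrays;

final class WriteAt {
    // Returns a copy of content with data written at offset, growing the array if needed.
    static byte[] write(byte[] content, byte[] data, int offset) {
        int newLength = Math.max(content.length, offset + data.length);
        byte[] result = Arrays.copyOf(content, newLength); // gap bytes, if any, are 0
        System.arraycopy(data, 0, result, offset, data.length);
        return result;
    }
}
```

The callback would still return the number of bytes it was given, as doWrite does.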

Filling Fuse operations and starting FUSE

We have implemented all the methods for a basic file system. Now it is time to add them to the fuse operation structure.

public static void main(String... args) {

    System.load("/usr/lib64/libfuse3.so.3.10.5");

    args = new String[]{"-f", "-d", "/mnt/test/"};

    files.add("file54");
    filesContent.put("file54", "content of file54");

    try (var scope = ResourceScope.newSharedScope()) {
        rsScope = scope;
        var arguments = Arrays.stream(args).map(s -> CLinker.toCString(s, scope)).toArray(MemorySegment[]::new);
        var allocator = SegmentAllocator.ofScope(scope);
        var argumentCount = args.length;
        var argumentSpace = allocator.allocateArray(CLinker.C_POINTER, arguments);

        MemorySegment operationsMemorySegment = fuse_operations.allocate(scope);

        fuse_operations.getattr$set(operationsMemorySegment, fuse_operations.getattr.allocate((path, stat, fi) -> getAttr(path, stat), scope)); // A
        fuse_operations.readdir$set(operationsMemorySegment, fuse_operations.readdir.allocate((path, buffer, filler, offset, fileInfo, i) -> readDir(path, buffer, filler, offset, fileInfo), scope));
        fuse_operations.read$set(operationsMemorySegment, fuse_operations.read.allocate((path, buffer, size, offset, fileInfo) -> read(path, buffer, size, offset, fileInfo), scope));
        fuse_operations.mkdir$set(operationsMemorySegment, fuse_operations.mkdir.allocate((MemoryAddress x0, int x1) -> doMkdir(x0, x1), scope));
        fuse_operations.mknod$set(operationsMemorySegment, fuse_operations.mknod.allocate((MemoryAddress x0, int x1, long x2) -> doMknod(x0, x1, x2), scope));
        fuse_operations.write$set(operationsMemorySegment, fuse_operations.write.allocate((MemoryAddress x0, MemoryAddress x1, long x2, long x3, MemoryAddress x4) -> doWrite(x0, x1, x2, x3, x4), scope));

        fuse_h.fuse_main_real(argumentCount, argumentSpace, operationsMemorySegment, operationsMemorySegment.byteSize(), MemoryAddress.NULL); // B
    }
}

In the code above you see the finished main method. We added six method calls inside the resource scope to add methods to the fuse operation structure. We also added fuse_h.fuse_main_real to mount our filesystem.

At line “A” you see how we add the getattr method to fuse_operations. We call the fuse_operations class and tell it that we want to set the getattr method on our fuse_operations MemorySegment (the first parameter). The second parameter creates a memory address for the lambda that both the generated Jextract code and FUSE can use. The last argument, scope, is the owner of the memory segments and memory addresses that are used.

We need to do this for every method we want FUSE to call. These six calls share the same pattern: we only have to point the lambda to the correct function and call the matching functions on fuse_operations.

At line “B” we mount our file system. We call fuse_main_real on the fuse_h class with the arguments from args, the fuse_operations structure in which we have set six callbacks, and the size of that structure. Once the application has started, you can create files and directories inside the mount point. The program keeps running until you stop it or the file system is unmounted. You can unmount it using fusermount -u {MOUNT_LOCATION}. Note: if you stop the application yourself, you still have to unmount the file system.

Conclusion

You did it! We created an in-memory file system in Java using the Foreign Linker API. We used Jextract to generate Java classes from C header files, used CLinker to convert Java strings to C strings and back, called the C function fuse_main_real directly from Java code, and created six upcalls that FUSE invokes when an event happens inside the filesystem.

Source code and references