• 0 Posts
  • 46 Comments
Joined 2 years ago
Cake day: June 21, 2023

  • I explored the source of file(1), and the part that determines the type of a text file seems to be in text.c: https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/usr.bin/file/text.c?rev=1.3&content-type=text/plain

    In particular, this part:

    static int
    text_try_test(const void *base, size_t size, int (*f)(u_char))
    {
    	const u_char	*data = base;
    	size_t		 offset;
    
    	for (offset = 0; offset < size; offset++) {
    		if (!f(data[offset]))
    			return (0);
    	}
    	return (1);
    }
    
    const char *
    text_get_type(const void *base, size_t size)
    {
    	if (text_try_test(base, size, text_is_ascii))
    		return ("ASCII");
    	if (text_try_test(base, size, text_is_latin1))
    		return ("ISO-8859");
    	if (text_try_test(base, size, text_is_extended))
    		return ("Non-ISO extended-ASCII");
    	return (NULL);
    }
    

    So file(1) is not capable of saying whether a file is UTF-8 right now. There is another file (/etc/magic) which can help determine whether a text file is UTF-7 or UTF-8-EBCDIC, because those need a BOM, but as you said, UTF-8 does not need a BOM. So it looks like we are stuck here :)




  • I think there is an argument to be made that if you want to develop a game for, say, the PS5, you can hone your game to the PS5 hardware and it could be extremely stable. This is not possible on PCs because PCs do not have fixed hardware.

    However, I think this was truer in the olden days of the SNES, where games were not glitchy compared to DOS gaming, where hardware compatibility was all over the place. You can see this on YouTube channels like LGR, where finding a compatible sound card is a challenge.

    But like you, I don’t find that this is still true for modern PC gaming.