Skip to content

Ticket #5047, #5048: Separate mc.keydef file, updated syntax#5061

Draft
egmontkob wants to merge 7 commits intoMidnightCommander:masterfrom
egmontkob:5047-keydef-file
Draft

Ticket #5047, #5048: Separate mc.keydef file, updated syntax#5061
egmontkob wants to merge 7 commits intoMidnightCommander:masterfrom
egmontkob:5047-keydef-file

Conversation

@egmontkob
Copy link
Copy Markdown
Contributor

Proposed changes

Move the escape sequences from the global "mc.lib" and user "ini" file to new global and user "mc.keydef".

Saner escaping, nicer look.

Removed "terminal:" prefix.

Checklist

👉 Our coding style can be found here: https://midnight-commander.org/coding-style/ 👈

  • I have referenced the issue(s) resolved by this PR (if any)
  • I have signed-off my contribution with git commit --amend -s
  • Lint and unit tests pass locally with my changes (make indent && make check)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary documentation (if appropriate)
@egmontkob egmontkob requested a review from mc-worker March 10, 2026 12:28
@github-actions github-actions bot added needs triage Needs triage by maintainers prio: medium Has the potential to affect progress labels Mar 10, 2026
@github-actions github-actions bot added this to the Future Releases milestone Mar 10, 2026
@egmontkob egmontkob requested a review from zyv March 10, 2026 12:28
@egmontkob egmontkob added area: keybind Key bindings and removed needs triage Needs triage by maintainers labels Mar 10, 2026
@egmontkob egmontkob modified the milestones: Future Releases, 4.9.0 Mar 10, 2026
@egmontkob
Copy link
Copy Markdown
Contributor Author

This is a cleanup before finalizing the real work in #4632 & friends.

Documentation (manpage etc.) is yet to be updated, but I'd prefer to hear a round of feedback first.

Copy link
Copy Markdown
Contributor

@mc-worker mc-worker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick view.

lib/terminal.c Outdated
s++;
}

return g_string_free (ret, FALSE);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return GString itself.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return GString itself.

Is there a generic guideline here? Should we always prefer GString to char* as retval? How about input parameters?

Or should the API depend on how the internals were the most convenient to implement: if we already have a GString then return that?

I was tempted to change the input of this function, as well as the retval of the decode_controls() function to GString, for the reason that theoretically a terminal's escape sequence may contain a NUL byte (although in practice I really don't think it does, except for Ctrl+@ a.k.a. Ctrl+space which is the single NUL byte, handled elsewhere in the code).

For the retval of this function, as well as the input parameter of the decoding counterpart, the semantics is C-style string: cannot contain embedded NUL byte. What's the point of changing the return type to GString? Also, for consistency, shouldn't we then change the input parameter of the decoder too, making it possibly more cumbersome for the caller? Or have 3 GStrings and 1 char*?

Would love to learn more about the preference in general, and its reasons; thanks! :)

@egmontkob
Copy link
Copy Markdown
Contributor Author

Added a new first commit, moving from keymap to mcconfig a method that I previously copy-pasted into keydef.

I haven't yet addressed the char * / GString issue for decode/encode, I'd like to see a consistent full picture which of the 2+2 input parameters and retvals should be changed, please help me here :)

@mc-worker
Copy link
Copy Markdown
Contributor

I haven't yet addressed the char * / GString issue for decode/encode,

You do use a GString object within a function. Then you destroy it in parts in two different places: the container and some its members in the function, the char array in the caller function. It would be logical to destroy the entire object at once. And if you need the length of the string in the caller function, you have to call strlen() instead of use a lenmember of GString.

@egmontkob
Copy link
Copy Markdown
Contributor Author

You do use a GString object within a function.

I happen to use it, in both the decode and encode function, for no particular reason. Each of them could have very easily been written without GString. Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

I do not write the implementation first, and then tailor the interface to that, this wouldn't make sense to me. I first create the interface, with whatever types that I believe make the most sense, which in this case was char *, then I come up with an implementation, which in this case happened to use GString helpers.

Then you destroy it in parts in two different places: the container and some its members in the function, the char array in the caller function. It would be logical to destroy the entire object at once.

Destroying at two parts is a standard practice, this is explicitly what g_string_free (..., FALSE) was designed for. It's used in mc's codebase around 40 times. There's nothing wrong with it.

And if you need the length of the string in the caller function, you have to call strlen() instead of use a lenmember of GString.

I don't think I use the length in the caller. But even if I did...

I know, for some reason, mc has an extremely strong aversion against running strlen() twice on a string. We're talking about teeny-weeny strings here, very few of them, accessed very rarely (typically only at mc's startup). Maybe we can shave off a few nanoseconds. But that's not necessarily true either.

Accessing data via a GString * requires one more indirection than via a char *, possibly accessing memory at two very different places, which might be worse cache-wise than accessing one neighborhood twice. So it's not clear whether the performance would increase or decrease, having access to len and not having to measure it again. (But, again, we actually don't need the length.) But it's absolutely certainly super negligible, it's the worst use of our time trying to optimize it.

What we need to optimize for is code cleanness, consistency, and quick development flow.

For the decode function, a return value of GString * might convey the incorrect message of supporting the NUL byte. It does not support it. I upgraded all the char * to GString * that are related to learning keys, reading keys from the config file, and it still didn't work for me. (The byte 0x01 Ctrl-A, or ESC followed by this byte also didn't work for me as an "escape sequence" assigned to a key.) There's more work to be done, and presumably exactly zero need for it, so I'd rather not do it now.

And how far would you go with your logic? E.g. mc_config_get_escape_sequence_list() returns a list of strings as char **, now one question for sure is whether to rather return GString **, another question is that if I happen to internally go with GArray then should I return that, just so that the caller has immediate access to the number of strings returned?

I honestly really don't like this philosophy of adjusting the interface: the return value (and you didn't respond: what about the input parameters?) to the implementation, rather than designing the interface first. I really don't like trying to micro-optimize the most insignificant bit, let alone with no information whether it would actually speed up things or not if the caller cared about the string length, let alone the caller now doesn't even care. I really don't like being so afraid of maybe (but in fact: not) calling strlen() an additional few dozens of times per mc startup.

I'm tempted to say that if you firmly insist on returning GString * if it's created in the function body anyway, my proposed solution would be to rewrite the body of those functions not to use GString :)

It might look like I'm just spending 15 minutes to type this comment instead of changing what you asked for. Behind it is much more unpublished work actually trying to adjust the code, and if we're going with GString anyway then trying to add support for embedded NUL bytes, which would also require to change the interface of mc_config_get/set_escape_sequence_list() from char ** to GString **, for which I used GArray as an internal helper, but then again back to the question: shouldn't I return that GArray just in case the caller finds it useful...? and do it in reasonable, well-polished commits in some meaningful order... just to realize during the process that e.g. unittests became uglier, and somehow the whole code just doesn't feel as clean to me as the char * version was. (Surely, I could restart and address your particular immediate feedback only...)

I'd really like to stick to the char * version for now, and focus on more important things... there's a lot left to do with keyboard handling, this PR is only about 1/4 - 1/3 of the way there.

@egmontkob egmontkob force-pushed the 5047-keydef-file branch 4 times, most recently from 98c3c4a to a5ebd7d Compare March 12, 2026 13:51
@egmontkob
Copy link
Copy Markdown
Contributor Author

How do you like the direction of the latest push?

Since I've worked on it a lot, I thought I'd upload it.

The encode_controls() / decode_controls() methods now support embedded NUL in the raw sequence.

encode_controls() takes a char * string and a ssize_t len parameter, the latter potentially being -1 for C-terminated strings; a pretty standard practice seen in GLib and elsewhere so that you don't have to create a GString if you don't have one. Currently there are two callers of this method: mc_config_set_escape_sequence_list() which does have a GString and the unittest which doesn't. I can modify to take GString if you prefer that.

decode_controls() returns a GString * because here the return value is the one potentially containing an embedded NUL.

The retval of encode_controls() and the input param of decode_controls() are still a char * because here we don't need embedded NUL. I still don't like the idea of encode_controls() returning a GString * just because it happens to have one internally, but if you really insist then I'll do it to be able to move forward with the rest. Also please then advise about the input param of decode_controls(), should that remain a char *?

Much of the internals have been reworked to carry a possible internal NUL of the escape sequence, although it doesn't work, it would need further investigation which is not my focus right now. That being said, if we ever happen to need that, the current underlying work might become useful.

Please advise whether I should continue this latest version, or the one before (with char * everywhere), and also if you really insist on converting both methods' output params to GString *.

@mc-worker
Copy link
Copy Markdown
Contributor

Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

You love the hand-made memory management, don't you?

Destroying at two parts is a standard practice, this is explicitly what g_string_free (..., FALSE) was designed for.

Possible is not mandatory.

It's used in mc's codebase around 40 times.

It's not an argument. The code written in old years is needed to be refactored from time to time.

Maybe we can shave off a few nanoseconds. But that's not necessarily true either.

First we avoid micro-optimization, then we avoid any optimization at all. As a result we have a software that works hardly on multi-core multi-GHz CPUs. Don't forget than mc runs not only on desktops but in embedded devices where CPUs are not so productive.

and you didn't respond: what about the input parameters?

Getting char * from GString is just for free. Getting GString from char * requires extra memory allocations.

shouldn't I return that GArray just in case the caller finds it useful

In mc_config_get_escape_sequence_list() you returns array and its length as two separate entities. Why you don't want return a single object of type GPtrArray that contains them both?

mc_config_set_escape_sequence_list() takes NULL-terminated list and its length. Why? One of those is enough. (Who sad GPtrArray?)


I don't want to argue anymore. I just can modify myself your code from my point of view (in the separate branch).

@egmontkob
Copy link
Copy Markdown
Contributor Author

Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

You love the hand-made memory management, don't you?

No, I absolutely hate hand-made memory management, I truly wish we had something even higher level than GLib.

When I said plain char *, I didn't elaborate: I wouldn't have gone with manual dynamic on-demand reallocating. I would have gone with measuring the input's length, and allocating that much, or twice that much for the encoder, which is guaranteed to be enough in this particular encoder/decoder. If I needed to do the reallocation manually, I'd almost certainly find some higher level methods (e.g. GString) to do that for me.

First we avoid micro-optimization, then we avoid any optimization at all. As a result we have a software that works hardly on multi-core multi-GHz CPUs [...]

It's absolutely not obvious that GString * with its on-demand expansion (including reallocating and copying over the data) would be any more effective than measuring the size upfront and allocating a large enough buffer to begin with, something that doesn't need the GString * wrapper on top of char *. Plus, accessing data via GString * rather than char * requires one more indirection. I am absolutely not convinced that your preferred code is any more effective than mine.

What I am convinced about is that it doesn't matter at all. Performance is not going to measurably increase finding whichever of GString * or char * is more effective at a place that's executed maybe around a few dozen times per mc startup.

In mc_config_get_escape_sequence_list() you returns array and its length as two separate entities. Why you don't want return a single object of type GPtrArray that contains them both?

I wanted to stay consistent with the g_key_file_get/set_...() API and their mc_...() wrappers (which you surely may find a bad idea). Also, GPtrArray was not in my active knowledge, I used GArray internally which I honestly hate so I'd rather switch to GPtrArray or whatever else :)

Also, as I've already stated: I wouldn't want the function signature to be based on whatever internals I happened to have picked (manual array allocation or GArray or GPtrArray or whatever). I believe that for the function signature to be easy-to-use and consistent with the similar functions is more important than reflecting its internals in the hope of possibly (although not provenly) shave off a few nanoseconds. As I understand you, you disagree with me here.

mc_config_set_escape_sequence_list() takes NULL-terminated list and its length. Why?

Again, my primary reason was consistency with g_key_file_get/set_...().

One of those is enough. (Who sad GPtrArray?)

... but above you said ...

what about the input parameters?

Getting char * from GString is just for free. Getting GString from char * requires extra memory allocations.

So at one point you argue that char * as the input parameter is fine because there asking for a GString * requires extra memory allocation in the caller, and yet here you argue that the caller should wrap its data in a GPtrArray?

I don't see a consistent guideline from you.

I don't want to argue anymore. I just can modify myself your code from my point of view (in the separate branch).

I don't want to submit code that you really dislike and would jump on to refactor. But I don't yet understand the rationale behind your requests, nor do I see those requests being consistent.

Do you like that in my latest version I changed many char * to GString * where we store the raw escape sequence in memory or pass it around, in preparation of possibly supporting an embedded '\0' byte in the future? Is it okay if I continue from that latest version, rather than the previous one?

I'm happy to change the internals from GArray to GPtrArray.

For the retval: Okay, I can return GPtrArray, reflecting the (soon-to-be) internals of the getter. Surely not what I would have gone with on my own but a reasonable decision that I'm happy to accept.

As for the input parameter of the setter: GPtrArray so that the caller has to make extra allocations? I'm getting mixed messages, please clarify what you'd like to see here.

@egmontkob
Copy link
Copy Markdown
Contributor Author

egmontkob commented Mar 17, 2026

Changed the getter's return type to GPtrArray * – am I getting closer to what you have in mind?

Yet to address the setter's input parameter according to clarified instructions.


I'm not a 100% satisfied with the design of returning GPtrArray *, reason being that it's quite an ambiguous type.

For one, it doesn't convey whether its elements are char * or GString * or whatever else.

Then, it doesn't convey whether the array is NULL-terminated or not. This is only answered for us because NULL-terminated is a recent GLib addition and we support much older versions, too.

It also doesn't convey the message whether the array has been set up to automatically free the members, or not. Currently, for me, it's not. As far as I can see, in order to set up automatic freeing, I'd need a tiny helper function to pass that TRUE to g_string_free() because g_ptr_array_new_full()'s element_free_func only takes a function but no parameters.

Anyone who uses this method either needs to reads its documentation where all these details have to be carefully specified – which I'm yet to do!!! –, or has to check the implementation.

Returning a GString **, and having an additional int *length parameter, doesn't really have this problem. It's an absolutely well-established pattern that int *length should be optional, implying that the return value must be a NULL-terminated list, and then no more questions can be asked. A much less ambiguous design.

This direction of returning GPtrArray * does not feel clean and robust to me.

@ossilator
Copy link
Copy Markdown
Contributor

GString is sort of a pointer type. it wouldn't be unreasonable to just return it by value.

@egmontkob
Copy link
Copy Markdown
Contributor Author

GString is sort of a pointer type. it wouldn't be unreasonable to just return it by value.

Yup (and same for GPtrArray). It's not an existing pattern of mc, though. Out of the many possible solutions I'd preferably pick one that matches our existing practices.

@mc-worker
Copy link
Copy Markdown
Contributor

@egmontkob please look at branch 5047-keydef-file-2.

@egmontkob
Copy link
Copy Markdown
Contributor Author

please look at branch 5047-keydef-file-2.

Only the first of each array of tests is run. In main() you have to change tcase_add_test() to mctest_add_parameterized_test().

Once you do this, the tests tell you that decoding a trailing ^ fails, your refactoring of where you bump the source pointer causes it to jump over the terminating \0 after the lone ^ in the source string and keeps processing whatever it finds there in memory.


In encode_controls() you measure the source string's length and pass it as a hint for the result string's size. This is a bad hint since the string will almost certainly grow, there will be at least one byte that needs escaping. Passing twice the size (I don't know if it needs a +1 for the trailing zero) would prevent further reallocation. Or just +1 (and another +1 for the trailing zero if that's required) because likely only one byte will need escaping. Even not passing any hint is IMO cleaner than passing it a hint that's almost certainly too low.


The encode/decode method taking a const ssize_t len and then computing size_t len2 and using that all throughout is quite fragile: It's very easy to accidentally use len rather than len2 in the body and not notice it. I think having a single variable, overwriting the non-const input parameter is safer in this regard.

At least changing the names to something like const ssize_t maybe_len as the input parameter and deriving from it a local variable size_t len and using that in the body pretty much eliminates this possibility for error.


Honestly, I can see that your preference for code is different from mine, but not that it's necessarily any better (regarding both readability / robustness and micro-performance). Anyway.

I really want to move on and focus on what really matters, so I don't care which version ends up in the code. Could you please fix the aforementioned actual bugs, consider my other suggestions, and submit the code? Thanks!

mctest_assert_str_eq (decode_controls_simple ("\\s"), " ");
mctest_assert_str_eq (decode_controls_simple ("\\\\"), "\\");
mctest_assert_str_eq (decode_controls_simple ("\\^"), "^");
mctest_assert_str_eq (decode_controls_simple ("\\^"), "^");
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated entry (my bad) - @mc-worker could you please remove its equivalent in your branch?

Move the method to get a config file's full name from keymap to mcconfig,
because in the next commit keydef will need the same method.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…o a separate file

Move the escape sequences from the global "mc.lib" and user "ini" file
to new global and user "mc.keydef".

Add command line option --keydef/--nokeydef and environment variable
MC_KEYDEF, analogously to the keymap feature.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
Repeated keys aren't supported.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
@egmontkob
Copy link
Copy Markdown
Contributor Author

There's also a compiler warning at a for (..., ..., *p++) about that * being pointless.

@egmontkob
Copy link
Copy Markdown
Contributor Author

Please check the updated PR, based on your modifications.

Separate entries by space rather than semicolon. Semicolons are no longer
escaped.

The escape character is represented by \e, \E or ^[, rather than \\e.

Other special characters also have more standard notation, e.g. ^H and ^?.

Change the internal variables to hold the raw value rather than the
escaped string.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
This makes us one step closer to supporting escape sequences with an
embedded NUL byte, although we probably won't need this.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ydef section names

Also rename [general] to [_common_], emphasizing that it's not a
terminal name and that it's always loaded.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
Signed-off-by: Egmont Koblinger <egmont@gmail.com>
@egmontkob
Copy link
Copy Markdown
Contributor Author

I've done some performance measurement

In both cases I query the decoded config file values 100'000 times but then do nothing with them. In both cases I print the cumulative time since mc's beginning, that is, you should only examine the last line of output.

With my early version, the cumulative time is almost always 1.19s on my computer, with the latest version it's almost always 1.33s.

Dividing back with 100'000: The difference is a bit more than 1 millionth of a second, per mc startup.

In favor of my version.

So much about the "let's use GString* instead of char* everywhere because it's faster and a lot of such tiny improvements do add up to a lot" approach. No, it's not faster, it was a blind assumption of yours that turned out to be wrong. And no, it does not matter, you can optimizate by this extent (in the right direction) at 10'000 different places in mc and it will still be under one UI frame, i.e. absolutely not noticeable to the user.

Perf improvements need to be measured, and need to be spent where it really matters.

As for code clarity or style, I still haven't received an overall guideline that I should have followed, or could follow in the future.

We have just wasted 2 weeks over this utterly insignificant GString* vs. char* debate, refactoring the code a few times, bugs sneaking in that had to be fixed, etc. Exactly ZERO of our users will care. What they care about is: Does Shift+F6 do what it's supposed to do?

With this PR getting merged, I'll be roughly 1/4 of the way there. @mc-worker, I know you've seen it, earlier I've had enormous debates/flaming around this topic with Ossi, now it seems I'm about to have it with you. It cannot go on like this. I am not willing to repeat this. If you want me to fix Shift+F6 and friends, you have to be magnitudes more lenient about the tiny insignificant details not looking exactly as you would have done them. Otherwise I'll just walk away and find better things to do.

I know earlier I said "I don't want to submit code that you really dislike" but I just have to revise it. I want to get done with the Shift+F6 fixing, as quickly as possible in a decent quality. We seriously need to make the development workflow much-much snappier. What happened in this PR just won't work in the future.

@mc-worker
Copy link
Copy Markdown
Contributor

Sorry for the long silence. I have some problems with the Internet. I answer you as soon as I can.

_ (key_name_conv_tab[action - B_USER].longname));
mc_refresh ();
if (learnkeys[action - B_USER].sequence != NULL)
MC_PTR_FREE (learnkeys[action - B_USER].sequence);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g_string_free (learnkeys[action - B_USER].sequence, TRUE);
learnkeys[action - B_USER].sequence = NULL;
/*** declarations of public functions ************************************************************/

gboolean define_sequence (int code, const char *seq, int action);
gboolean define_sequence (int code, const char *seq, gssize len, int action);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name len is here and maybe_len is in the function definition.
I'd prefer the name len for argument and actual_len for variable in the function body. Up to you.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is indeed an inconsistency that I missed.

For anyone looking at the header, examining the interface rather than the implementation, where we don't even have to give it a name, calling it len might be a cleaner choice (although I'm not entirely sure about it). Mind you, for someone (other than the compiler) to look at the header rather than the implementation, we should also move the documentation there – all throughout mc; probably not going to happen.

My problem with len and actual_len is the same as with len and len2. You can very-very easily accidentally use the wrong one and very-very easily miss that during review. Calling them maybe_len and len fixes this problem.

I've been thinking about it in the last couple of days, and my conclusion was that our style guide is wrong (or at least misleading with its wording) and should be relaxed. It says "Avoid abusing non-const function parameters as local variables" and shows an example where a variable is reassigned with something that presumably has a different semantics.

For an optional argument, whose real value can be determined from other circumstances, this should not apply. THE correct solution is what I did originally: call it len all throughout, and fill it in at the very beginning. It's not abuse, it's a core feature of the C programming language that function parameters are non-const by default, you're free to change them as you please. Replacing them by something with a different semantics is indeed abuse, but that's not the case here. Here it's 100% guaranteed that we won't ever need the original value once we override it, and we also want to be 100% sure that we don't accidetanlly use the original value after that point.

I am pretty positive that overwriting the single variable len is the best thing to do here, and it is the pattern I've seen at other places (e.g. across GLib in exact same situations).

If I managed to convince you, I urge you guys to rephrase the style guide, and I will switch to this syntax. If I did not manage to convince you then I'll change the header file to also call it maybe_len.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with egmont.

@mc-worker
Copy link
Copy Markdown
Contributor

From bottom to top.

Exactly ZERO of our users will care. What they care about is: Does Shift+F6 do what it's supposed to do?

What about developers? Changes invisible for users are perfectly visible for developers.

As for code clarity or style, I still haven't received an overall guideline that I should have followed, or could follow in the future.

The link to the our coding style is in the start message of this PR, directly under the word Checklist.

The encode/decode method taking a const ssize_t len and then computing size_t len2 and using that all throughout is quite fragile: It's very easy to accidentally use len rather than len2 in the body and not notice it. I think having a single variable, overwriting the non-const input parameter is safer in this regard.

I agree, but @zyv doesn't. He wrote such way in the coding style: Avoid abusing non-const function parameters as local variables:

In encode_controls() you measure the source string's length and pass it as a hint for the result string's size. This is a bad hint since the string will almost certainly grow, there will be at least one byte that needs escaping. Passing twice the size (I don't know if it needs a +1 for the trailing zero) would prevent further reallocation. Or just +1 (and another +1 for the trailing zero if that's required) because likely only one byte will need escaping.

Since this commit, the initial string size is 64 bytes. I think, it's enough for most cases.. Next grow will reallocate it to 128 bytes. Next to 256 and so on.

@egmontkob
Copy link
Copy Markdown
Contributor Author

egmontkob commented Mar 27, 2026

Since this commit, the initial string size is 64 bytes. I think, it's enough for most cases.. Next grow will reallocate it to 128 bytes. Next to 256 and so on.

I guess that's a good reason not to pass any initial size at all. Not a good reason to pass an almost certainly too low one.


What about developers? Changes invisible for users are perfectly visible for developers.
[...]
The link to the our coding style is in the start message of this PR, directly under the word Checklist.

Tell me more about how it's the char * vs. GString *, or char ** vs. GString ** vs. GPtrArray * change that will make the code more or less readable... (apart from GPtrArray * making it less readable, as I have already argued).

Something that the style guide says absolutely nothing about.

Something where I did not receive a clear, consistent guideline; actually, I did point out contradictions in your ad-hoc guidelines that are still unresolved (e.g. at wrapping the input parameter in a container object).

Something where one of your strong arguments was micro-performance, which I completely debunked.

Something where your request was not driven by having a cleaner function signature per se, but by adjusting the function signature to how its internals happen to look like.

What about my earlier comment about your preferred GPtrArray * being ambiguous in 2 or 3 different aspects where char ** or GString ** aren't, actually resulting in a harder to verify code?


The big picture is:

As you probably know, mc is not the only open source project I every now and then contribute to. There are 2 or 3 (depending on how you count) projects to which I contributed more than to mc, and dozens and dozens to which I contributed less.

There were some that were abandoned, or the developers ignored my changes. But I can't remember any where the developers did care, yet it was such a pain to get things done as in mc. I can't recall any other project that had such a bad ratio of actual work vs. meaningless nitpicking and administrative tasks as mc.

I'd like to remind you again that quite a few of your requested changes were not along the style guide, but along your personal preferred style against mine, not backed up by the style guide but by your incorrect beliefs about performance, or by how you felt the code was nicer, a feeling I often didn't share.

"Perfect is the enemy of good" – I'm sure you've heard it many times. Let alone when that "perfect" isn't even "perfect". When it's not documented as preferred approach, yet I would have to rework my work according to that. Or you rework, introducing bugs along the way. When it's along questionable items in the style guide.

It's not those tiniest insignificant details of the style guide – or, even so: coding preferences not laid down in the style guide – that will make mc better, and easier to maintain in the long run. It's the ability to quickly fix more bugs, quickly refactor bigger problems with the source.

This is not just you @mc-worker, it's also about @zyv – honestly, I was truly pissed off recently when I had to argue that I should be able to move a well-tested method from the obviously wrong file to where it obviously should have always belonged, but otherwise not touch it at all, without having to bring it up to standards with our recent coding styles, including testing that in various build configurations. It was just not the focus of my work.

Fixing Shift+F6 and friends is quite a complex task, a lot needs to be done, and right now probably it's me having the best overall understanding of the situation. I truly want to do this. And by "this" I mean "this", and not "this and a bunch of other things".

I cannot run a marathon if each and every footstep has to be perfect according to partially documented, partially existing-only-in-your-heads preferences, and if along the way I have to pick up every piece of garbage or fix every piece of loose stone that I'd step on.

We are extremely scarce on resources. We have to use those resources wisely. Wisely means focusing on the big things, not on the tiny details. And it also means having fun and satisfaction along the way, to attract and retain more and more contributors.

If I don't follow the style guide somewhere, point it out, I'm happy to fix.

If you have a suggestion, feel free to tell me, I will consider, may decide to adjust the code or may decide to keep my version.

If, at areas not covered by the style guide, such as char * vs. GString *, you require that I rework the code according to your personal preferences rather than mine (a refactoring which you yourself got wrong by the way), I will on principle refuse to do it in the future.

I am more than happy to volunteer and fix major issues with mc, in an effective and fun way. I am not happy to volunteerly do it multiple times in an ineffective process, according to undocumented preferences of the reviewers / project maintainers, if my different approach works just as well. This would take effectiveness, as well as the fun part, out of it. It's not worth that much for me.

Can we please agree that if the new code adheres to the style guide, is not objectively wrong, and is of a reasonable quality and readability, yet is not fully according to the reviewer's taste (e.g. author chooses char * where reviewer would prefer GString *) then it must pass code review? Having to change it from the author's personal taste to the reviewer's personal taste cannot be a requirement?

If we can agree on this then let's please follow this in the future. As for me, I couldn't care less whether you create a followup change to taylor my work to your preferences, just please don't waste any second of my life with that.

If we cannot agree on this then I'm happy to write an overview of the pending drafts and remanining work to be done around the Shift+F6 issue and will wish you guys good luck completing the work and releasing an awesome 4.9!

Do you want me to fix Shift+F6, along with the necessary underlying changes or cleanups? I'm happy to! Do you want me to fix Shift+F6 according to your taste, and fix every other problem I come across along the way? Sorry, I've got better things to do.

With a healthy code review attitide, the Shift+F6 work that is currently around 1/4 complete would probably already be fully done by now. And I would probably be happy to move on to fixing the next real problem. Not the next char * vs. GString *, but the next real problem.

mc is one of the worst projects (if not the single worst) that I've contributed to with its expectations against contributions. This needs to change much more than that char ** needs to change to GPtrArray *.

@egmontkob
Copy link
Copy Markdown
Contributor Author

egmontkob commented Mar 28, 2026

Dear Andrew @mc-worker & Yury @zyv,

Recently some of the code reviews went in a direction that I extremely strongly disliked. I believe the existing review practices really hurt the project, and will hurt it in the future as well, and with regard to other contributors too.

It definitely had a terrible impact not just on the time I needed to pointlessly waste on the project but also on my motivation to continue. I have elaborated these in my previous comment. Having thought about it for yet another day I clearly decided that:

I am halting any coding work I'm doing on mc until you both acknowledge the following.

(I don't care if you follow it for other contributors as well (although I firmly believe you should), but as for myself, I will not continue without these principles being firmly laid down.)

  • Code that is clean, readable, working, and follows the style guide, MUST PASS code review, even if written differently than the reviewer would have written it. For such code the reviewer might make suggestions (preferably with rationale), but NOT requirements. The code's author decides whether to address or reject these suggestions. Even in case of rejection, the author does not need to explain the reasons. The mere fact that for any reason whatsoever the person who did the actual work chose go with that solution MUST outweigh the fact that the reviewer / projects's core developer would prefer it to look differently. If the reviewer / core developer disagrees, he is welcome to do the work himself.

  • Code review can suggest to, but can NOT require to extend the scope of the work, unless it's inevitable (read: if without that the PR is faulty to begin with). This includes, but is not limited to, bringing up to current standards bits of code not touched or marginally touched by the change (moved to a new location, indented/outdented by a level, variable renamed, etc.).

  • While great performance of the software is important, it is not reached via changes that are blindly (and thus perhaps incorrectly) assumed to shave off a few nanoseconds. It is reached via changes of magnitudes bigger impact (e.g. issues 3859, 4891, 4548, 4789) and verified via measurements. An insignificant micro-optimization, let alone an assumed but unverified one, is NOT a reason to require code modification. Every minute we humans would spend on reworking a PR is infinitely more precious than a barely (if at all) measureable amount of CPU time per mc instance.

  • While it's important to have a clean, maintainable and mostly consistent source code, delivering user-facing features and bugfixes with reasonable amount of efficiently used developer efforts is way more important and needs to be the focus of our work, including the review process.

I sincerely hope that you guys agree and that we can continue the focused work on what really matters, rather than wasting weeks (or even just a minute) around the most freaking unimportant details.

@ossilator
Copy link
Copy Markdown
Contributor

whether something is written down in the style guide or not is a meaningless distinction, because ultimately it's andrew and yuri anyway who decide what ends up there.

your complaint about "request" makes no sense to me. this is a very broadly interpretable term, and i perceive it as rather weak (but then, i'm just rude 😜). it would be different if you wrote "demand", or at least "ask for".

one shouldn't reject feelings about what constitutes "nice" code out of hand, because these come from years of experience. conversely, one shouldn't trust them blindly, but instead examine them critically. this discussion needs to be had, though maybe a more systematic approach would be more constructive.

generally, i think it makes sense to keep conversions and other redundant work at a minimum, because it reduces the cognitive load on the reader. that silicon cpus also benefit from it a teensy bit is just a nice side effect. by extension, i think it absolutely makes sense to adjust the parameters and return types to the actual implementations, unless one is dealing with a proper api that needs to adhere to an ownership model.

ret = g_string_sized_new (len + 1); // expect that 1 character will need to be escaped

for (; s < end; s++)
if (*s == 0x1B)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned this somewhere else as well, but i'd just drop the special handling of escape. ^[ is just fine.

/*** declarations of public functions ************************************************************/

gboolean define_sequence (int code, const char *seq, int action);
gboolean define_sequence (int code, const char *seq, gssize len, int action);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with egmont.

@egmontkob
Copy link
Copy Markdown
Contributor Author

egmontkob commented Mar 29, 2026

your complaint about "request" makes no sense to me. this is a very broadly interpretable term, and i perceive it as rather weak (but then, i'm just rude 😜). it would be different if you wrote "demand", or at least "ask for".

It's the subtle differences in the meaning of words that I don't fully feel as a non-native speaker. I meant it as "demand", "insist on". Sorry for not being clear. I'll edit that comment.

by extension, i think it absolutely makes sense to adjust the parameters and return types to the actual implementations, unless one is dealing with a proper api that needs to adhere to an ownership model.

It's not about which return type is the best. It's about not having to have a discussion about the return type at all. It's whatever I happened to have written it the first place for no particular reason, because having to do it again differently for a possibly marginally better code is an utter waste of time that I'm not willing to invest.

Remotely related: https://xkcd.com/1445/, although for us it's not the cost of analyzing but rather the cost of adjusting from A to B just because the reviewer, with decades of experience, and a false idea of performance, would find that nicer; whereas I, with decades of experience, happened to have picked the other solution which I find similarly nice; and by the way have wasted my time to argue why the requested new variant is not necessarily faster, and even measured that in fact it's not, and have argued why the new variant (at GPtrArray *) is less readable (as some important details aren't conveyed by the type).

It's an utter waste of our time to have to discuss which possible solution is perhaps a tiny bit better, or to have to change the code from one to the other. We just need to fix the problems somehow, anyhow in a decent quality. This means that the first iteration which is of acceptable quality has to land. No time to waste on coming up with further "even nicer" or "even nanoseconds faster" (or, as in this case: actually neither of these two) solutions. That's what I'm talking about. Let's get things done.

@ossilator
Copy link
Copy Markdown
Contributor

the problem is that one can frame that attitude as either "pragmatic" or "short-sighted".

certainly the particulars of this case are utterly irrelevant in the bigger picture, but it's also about setting an example and establishing standards for the future.

so it's always a balancing act.

@egmontkob
Copy link
Copy Markdown
Contributor Author

egmontkob commented Mar 29, 2026

The decision is of course up to mc's core developers.

What I'm saying is: If I ever again have to have a debate about why I'm not changing a function that I've simply moved to the file where it always should have belonged to, or if I ever again have to have a debate about char * vs GString *, or any other similar freaking insignificant issue not laid down firmly in the style guide, then I'm out of here.

It's not worth it for me.

I want to fix important and very complex bugs without having to get delayed by weeks, b.tching on the tiniest insignificant details. If I can't do it in an effective way, spending the required time but not wasting much more than that on it, then I'm not doing it at all, that's fine by me.

I am writing code in my style, within the style guide's rules. I am not writing code in Andrew's or Yury's style. I would let them do that.

@egmontkob
Copy link
Copy Markdown
Contributor Author

... And, once more, I'd like to point out that I have already contributed to many-many-many open source projects, but I haven't ever seen this code reviewing attitude that we see here in mc.

@egmontkob egmontkob marked this pull request as draft March 29, 2026 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: keybind Key bindings prio: medium Has the potential to affect progress

3 participants