AI is Going in the Wrong Direction

This post is gonna be a lot more technical than my usual ones, but don’t be mistaken: this is still just me venting my thoughts on a topic. In no way should this be treated as an academic source or anything like that. I just want to use my knowledge of machine learning and AI to voice my thoughts on the current direction we’re going with AI and why it’s not what anyone expects.

A Brief Overview (what even is AI?)

To begin, I would like to explain some things about AI, because so much misinformation gets spread online that I’m sure most of you have already been misled by a lot of the arguments made around the subject. So what even is AI? As everyone knows, it stands for artificial intelligence, but that’s not exactly what it is, at least in the field of computer science. See, in computer science, AI is a very, very broad term that ranges from things like pathfinding algorithms (like the one Google Maps uses), puzzle solvers, and clustering algorithms, to the programs that people mainly associate with AI, like LLMs (ChatGPT, Gemini, Copilot, etc.) and image generation models. They’re basically just algorithms designed to solve problems that were previously thought to require intelligence. Most “AI tools” that people think about fall under Generative Artificial Intelligence, or GenAI for short, as they are built on models designed to generate output data based on an input. Now, I could go into detail about the societal impacts of GenAI and the negative effects of the massive data centers running LLMs, but that’s not what I want to focus on today, especially since there is so much misinformation online about these things (seriously, the number of times I’ve seen people post figures about the water LLM data centers use without researching further or doing the math to see how exaggerated the figure is concerns me. I’m not denying that these data centers have severe environmental effects, but damn, do some actual research). No, the focus for today is on the idea of Artificial General Intelligence (AGI), the concept of AI most people are familiar with from science fiction and movies, and why, at the current rate, we are not going to achieve it any time soon.

Artificial Neural Networks

Let me just cut to the chase. The tools that people call “Artificial Intelligence” right now are nothing more than a really complicated math function. That’s it. Remember in your middle school and high school math classes how you’d have a function f(x) = something something and you’d plot it on a graph? Take that to the extreme with many more inputs and a much bigger graph with many dimensions, and you have ChatGPT, the thing that a lot of people think is going to take their jobs. And as for training those models, remember in science class how you’d collect data in an experiment, plot the points on a graph, and draw a line of best fit? That’s basically how these “AI” models are trained. There’s some calculus involved, which we’ll get into later, but that doesn’t take away from my point that these AI programs are nothing more than really big math functions.

Some people hear “neural network” in discussions about AI and think the computer is simulating the neurons in the brain, but no, it’s literally just what I described earlier: a really big math function. Now, there are some approaches to neural networks that attempt to model the more complex behavior of real neurons, but because of a thing called backpropagation (which I will get into later), the kinds of artificial neural networks used today are just really complicated math functions like I said.

So where does the name come from? Well, when we draw these neural networks, we represent the function as layers of nodes (circles), with each node feeding information to every node in the next layer through lines or arrows, until we reach the output layer. This tends to look a lot like the neurons in the brain, with the lines being like the synapses between them, but as we’ll get into later, it’s not quite that. Anyway, the input nodes are just the variables you put into the function, so f(x, y, z) would have 3 nodes in the input layer. Then, for each node in every subsequent layer, each node’s value from the previous layer is multiplied by the weight (the value on the arrow pointing to that node), the results are added together, and the sum is put through an activation function (not explaining that one).

[Diagram of an artificial neural network, from Reddit]

Because every node is connected to every node in the previous layer, we can represent all this multiplying and adding as matrix multiplication. GPUs are designed to do matrix math really fast, which is why demand for them skyrocketed in recent years. Before we go on, I want you to note that information in a neural network only flows in one direction, from the input to the output. There are some architectures that feed information backwards at certain points to handle sequential data, but my point is that these things are designed to be used as math functions where you put numbers in and get numbers out.
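
To make the “it’s just math” point concrete, here’s what a whole forward pass looks like in a few lines of Python with NumPy. The layer sizes, weights, and input are all made up for illustration:

```python
import numpy as np

def forward(x, layers):
    """One forward pass: each layer is (weights, biases).
    Every layer is just a matrix multiply, an add, and an
    activation function, applied from input to output."""
    a = x
    for W, b in layers:
        z = W @ a + b          # weighted sum of all previous nodes
        a = np.tanh(z)         # activation function (tanh here)
    return a

# Toy network: 3 inputs -> 4 hidden nodes -> 2 outputs (made-up sizes)
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),
    (rng.normal(size=(2, 4)), np.zeros(2)),
]
print(forward(np.array([1.0, 2.0, 3.0]), layers))
```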

And as for training, this is a little more complicated to explain in layman’s terms, especially since it took me a long time to grasp it, but essentially it’s just a bunch of calculus. You’re calculating the partial derivatives of the function and using them to make small changes to the coefficients. Yeah, that sounds complicated, so let me relate it back to the science class thing. Imagine you want to find the y = mx + b function for the line of best fit for your data. You want that line to fit not just your data but also everyone else’s in the class. So first you start with random values for m and b and draw a line through your data. It’s a bit off, so you check how far off each dot is from the line and average those errors to make small adjustments to m and b. Then you do the same thing with the next batch of data from a classmate. You repeat this over and over, even doing multiple passes over the same set of batches, until you have a nice and neat y = mx + b function whose line fits the data of everyone in the class. Then you draw this line over data from another class to see if your function fits theirs too. Your class’s data was the training data, and the other class’s was the testing data. That is basically what training a neural network is.
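
And here’s that science-class training loop written out, literally fitting y = mx + b to some made-up noisy data. The two update lines are the calculus part: the partial derivatives of the average squared error with respect to m and b:

```python
import numpy as np

# Made-up noisy data scattered around the "true" line y = 2.5x + 1
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 2.5 * x + 1.0 + rng.normal(scale=0.5, size=100)

m, b = rng.normal(), rng.normal()    # start with random guesses
lr = 0.01                            # size of each small adjustment

for _ in range(200):                 # multiple passes over the data
    for i in range(0, len(x), 20):   # batches of 20 points each
        xb, yb = x[i:i+20], y[i:i+20]
        err = (m * xb + b) - yb      # how far off each dot is
        # partial derivatives of the average squared error
        m -= lr * np.mean(2 * err * xb)
        b -= lr * np.mean(2 * err)

print(f"learned m={m:.2f}, b={b:.2f} (true values 2.5 and 1.0)")
```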

The Limitations

So now you know that it’s all just a really big math function. It’s still impressive how a bunch of numbers being multiplied and added together can achieve what we’ve done with AI tools. But there’s a polluting, heat-producing, energy-hogging elephant in the room. Sure, you can do these calculations, but when you scale things up to the level of, say, a Large Language Model (LLM) like ChatGPT, you’re gonna be doing a lot of calculations at a very high speed and using up a lot of memory just to hold the values you’ve calculated. Sure, there’s technology being developed that could make the pollution problem largely irrelevant, like analog matrix chips and quantum computing, but we still have the problem that neural networks just don’t scale up very well. As you make them bigger and more complex, you start to see diminishing returns, to the point where, just to approach the cognitive ability of even the dumbest human, you’d need a data center the size of a city. Surely, if humans can achieve cognition at a minuscule fraction of the energy and resource consumption of a neural network, we must be going in the wrong direction.

Also, I would like to note that not all LLMs require these insanely large data centers to serve a single user, as some people are even able to run DeepSeek models on their own computers. Sure, they’re nowhere near the performance of the latest ChatGPT models, but it goes to show that this technology does not need to be as harmful to the environment as it is.

I know I said before that the “neurons” in a neural network are not exactly analogous to real, biological neurons, but I’d like to bring the analogy back for this next point. I’m sure you’ve heard that you only use 10% of your brain. Taken literally, that’s a myth, but there’s a kernel of truth to it: at any given moment, only a small fraction of the neurons in your brain are actually firing. Lighting up all of them for every situation would waste a ton of energy, so the brain and the rest of the nervous system find ways to connect neurons that save energy while still accomplishing the same stuff. This is why things start to feel easier as you learn: you don’t have to think as hard to do the same things. It’s like when you first learn multiplication, it’s hard at first because you have to keep adding up numbers a bunch of times, but eventually, after doing your times tables enough, your brain automatically knows the answer without you having to do the adding again. Something similar happens with muscle memory: practiced movements get handed off to more automatic circuits that run with much less conscious effort (which is also why, once you develop muscle memory, you often forget how to consciously do those things).

Now, the thing is, some older approaches to artificial neural networks do emulate this sparse use of neurons, like NeuroEvolution of Augmenting Topologies (NEAT); however, that’s not really a thing with the fixed-topology architectures we see in most AI systems today, because every neuron is involved in every calculation. Actually, it’s even worse than that: trained networks often end up with neurons that contribute nothing but still get calculated, because a matrix multiplication doesn’t just skip over those values. This is part of why LLMs use so much energy: they’re essentially stateless machines that perform the same massive calculation for every input. I believe I’ve found a possible solution to this, but I’ll get into that later.
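
A tiny illustration of that last point, with made-up numbers: even when a neuron’s weights are all zero, the dense matrix multiply still grinds through its row.

```python
import numpy as np

# A hidden layer with 4 neurons, where neuron 2 is "dead": all of
# its incoming weights are zero, so it contributes nothing.
W = np.array([[ 0.5, -0.2,  0.1],
              [ 0.0,  0.0,  0.0],   # dead neuron
              [ 0.3,  0.8, -0.5],
              [-0.1,  0.4,  0.2]])
x = np.array([1.0, 2.0, 3.0])

# The matrix multiply still performs every multiply-add for row 2;
# dense hardware has no way to skip it, so energy is spent computing
# a value we already know will be zero.
print(W @ x)   # [0.4, 0.0, 0.4, 1.3]
```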

I mentioned the term “fixed-topology architecture” earlier. This basically means that the neural network keeps the same overall structure the whole time, and the only things that change are the weights between the neurons. The network will only ever do what it’s designed to do and will only find patterns in the data it is given. The interesting thing is that humans also have a roughly fixed number of neurons. With a few limited exceptions, neurons don’t divide like most other cells in the body, so you keep largely the same set of neurons throughout your life. The reason that’s not fixed topology is what I said before: the connections between neurons develop over time. The different regions of the brain are kind of like the different components of a neural network in that each one has its own purpose, but the brain forms these architectures naturally and can even rearrange regions into recursively smaller subregions, whereas an ANN needs to be designed specifically for its purpose. ANNs don’t optimize their energy usage like the human brain does because they’re not designed to. They’re just meant to minimize the value of a function.

Finally, I would like to bring up the importance of sensory and physical interaction to human intelligence. I’m no expert on childhood development, but I do know that early on, it is important for a child to learn through play. By playing, I mainly mean interacting with the world around us in general. Our senses are our only way of receiving information about the physical world, so in the early developmental stages of life, we spend our time building up an understanding of and intuition for our environment. We learn what hurts, what feels good, what happens when we do something, and so on. As we get older and gain more experience, we learn to connect these sensory experiences in abstract ways and gain the ability to model not just the physical world but also more abstract concepts like language, numbers, and art: the stuff we generally associate with the human experience. This is also where we get terms like “visual learner” or “auditory learner,” because we tend to think in terms of sensory experiences, as those are the things that let us model our internal world. Me, I’d say I’m a haptic learner, because I think in terms of physical movements, textures, and vibrations.

For a computer to truly emulate human intelligence, it would have to be able to interact with the physical world so it can understand how physical objects interact and develop an intuition for these things. There’s a popular argument in philosophy that all abstract thought is parsed as language, and I’ve heard people online point to LLMs as proof of this, since they can answer questions with only encoded language as input. I’ve always been opposed to that whole language argument, and I find that LLMs actually give more proof to mine, because they have shown many times that they have a very limited understanding of the physical world. Without getting into too much detail, LLMs basically encode words as vectors in a high-dimensional space, adjust those vectors based on their relationships with each other, and then produce an output based on what sequence of words makes the most sense to follow up with. They’re very good at answering questions that have already been answered by someone before, because their understanding of language is built on billions of examples of text, but their understanding of the physical world is based only on that same text. That’s why a model can’t figure out that a cup with no bottom and only a lid is just an upside-down cup, unless it has trained on enough examples of people describing that exact thing.
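
To give a rough feel for the word-as-vector idea, here’s a tiny sketch. The vectors are hand-picked toys; real models learn embeddings with hundreds or thousands of dimensions from massive amounts of text:

```python
import numpy as np

# Made-up 4-dimensional "embeddings" (real models learn these
# from data; the values below are hand-picked for illustration).
emb = {
    "cup":    np.array([0.9, 0.1, 0.0, 0.3]),
    "mug":    np.array([0.8, 0.2, 0.1, 0.4]),
    "galaxy": np.array([0.0, 0.9, 0.8, 0.1]),
}

def cosine(a, b):
    # 1.0 means pointing the same way, near 0.0 means unrelated
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(emb["cup"], emb["mug"]))     # high: related words
print(cosine(emb["cup"], emb["galaxy"]))  # low: unrelated words
```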

We do see computers interacting with the physical world, like robots or cars with adaptive cruise control, but those sensory experiences are far different from human experiences. For those computers, sensory inputs are treated as just more data to be used to minimize a loss function. The system will learn to pick up patterns from that data, but it won’t build any deeper understanding or intuition from it. It’s kind of like Plato’s Allegory of the Cave: the computer only ever gets to see shadows of the physical world, and since it cannot directly interact with it, it never builds a deep understanding of it.

A Possible Improvement

There are a lot more limitations of ANNs that I obviously haven’t brought up, but I think I’ve painted a clear enough picture of why this technology cannot lead to AGI, at least not without major costs. So then, what direction should we go in? Here, I propose an idea for an algorithm that addresses a point I made earlier: the brain is so much more efficient than an ANN because only a small fraction of its neurons need to be active at once.

My idea takes inspiration from a game called Akinator: The Mind Reading Genie. If you’ve never played it, you think of a character or famous person, and it asks you a series of questions about that character until it guesses correctly. The algorithm it uses is based on a more traditional machine learning concept called decision trees. Decision trees are basically just a series of questions about your data; depending on the answer to each question, you move on to another question, which leads to more questions, until you reach your answer. In Akinator’s case, every time you answer a question, it narrows down both the possible answers and the possible next questions. After enough questions, it has a small enough list of names to make a guess. No neural networks are involved, so the most expensive computations come from just looking up specific data. After training on a lot of data, it starts to find the most efficient chains of questions to reach the most common answers faster.
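
Here’s a toy, hand-written question tree in that spirit (real systems learn these structures from data; all the questions and characters below are made up):

```python
# A tiny Akinator-style decision tree as nested dicts.
tree = {
    "question": "Is your character a real person?",
    "yes": {"question": "Are they a scientist?",
            "yes": "Albert Einstein",
            "no": "Keanu Reeves"},
    "no": {"question": "Do they wear a cape?",
           "yes": "Superman",
           "no": "Mario"},
}

node = tree
while isinstance(node, dict):
    ans = input(node["question"] + " (yes/no) ").strip().lower()
    # Each answer discards an entire subtree, so only one short
    # path through the structure is ever touched.
    node = node["yes"] if ans.startswith("y") else node["no"]
print("I guess:", node)
```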

Notice how similar this is to what I said about human neurons: only a small fraction of the system needs to be active at any given time. This is the key to the algorithm I propose, but before we get into that, I need to explain attention mechanisms.

I’ll be honest, I still don’t have a great understanding of attention in machine learning, but I will do my best to explain it in a way you can understand. In neural networks, an attention mechanism basically lets the system look at how each piece of data relates to every other piece, so it knows which parts are most relevant to what it’s looking for, i.e. what to pay attention to. In language models, we use this to make sure the model interprets a word in the context of the words around it.

I’m not gonna explain the math because even I barely understand it, but I will explain the basics. Imagine you’re trying to find a book in a library to answer a question you have. The library has no computer, so you can’t exactly search for your question, but you notice the library is sorted so that books with similar concepts or keywords sit on the same shelf, or at least close to each other. So, using your question, you figure out which keywords to look for, and now you’ve narrowed things down to a few books that may contain the answer you seek. That’s obviously an oversimplified explanation, but hopefully you see what I’m getting at.

Now, continuing from that analogy: sometimes you won’t get the exact answer you want, but you’ll find information that gets you closer, so you modify your question a bit based on what you’ve gathered and keep searching. After enough questions, you find your answer. And with every answer you seek, you get a little better at navigating the library, until you’ve formed a set of questions you can consistently ask to get where you need to go.
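
For the curious, here’s the standard scaled dot-product attention computation that the library analogy maps onto, sketched in NumPy with made-up sizes. Your question is the query Q, the books’ keywords are the keys K, and the books’ contents are the values V:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: compare your question (Q) to
    every book's keywords (K), turn the match scores into weights,
    and blend the books' contents (V) accordingly."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)     # how well each key matches
    return softmax(scores) @ V        # weighted mix of the values

# Made-up sizes: 2 queries, 5 "books", 4-dimensional vectors
rng = np.random.default_rng(2)
Q, K, V = (rng.normal(size=(n, 4)) for n in (2, 5, 5))
print(attention(Q, K, V).shape)  # (2, 4)
```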

To avoid confusion in this next part, I’m gonna adjust some terminology. I’ll use the term Multi-Layer Perceptron (MLP), the classic name for a simple feedforward neural network, for the smaller networks in this proposed algorithm. This is because the algorithm uses these MLPs as an analogue for individual neurons instead.

Basically, this new algorithm combines MLPs with decision trees by forming a network of MLPs that lead into each other like a decision tree before arriving at an answer. In attention mechanisms, the question, or “query,” is represented as a linear transformation, which can just be a matrix (don’t worry, you don’t need to have passed precalculus to get all this). Same with the keywords and the contents of the books. The neat thing is that MLPs in their simplest form are also just linear transformations, so we can have each “neuron” be an MLP asking a question, and the answer will either lead to the output or to another question if the neuron decides it doesn’t have enough information. On the backwards pass, the algorithm adjusts the weights of the query matrices so that, over time, it forms the best questions for finding the correct output. Most importantly, the loss function will put emphasis on how many questions were needed to reach the right answer, so the network is incentivized to use far fewer neurons and form more efficient connections over time, much like a real brain.
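
To make that a little more concrete, here’s one very rough sketch of what the routing could look like in code. Every detail here (the confidence gate, the threshold, the hop limit) is a placeholder made up to illustrate the idea rather than a worked-out design, and the training step is left out entirely:

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, N_NODES, MAX_HOPS = 8, 6, 4   # made-up sizes

class Node:
    """One 'neuron': a small linear map that either answers or
    routes to another node to ask the next question."""
    def __init__(self):
        self.W = rng.normal(size=(DIM, DIM)) * 0.1        # its "query"
        self.out = rng.normal(size=DIM) * 0.1             # answer head
        self.route = rng.normal(size=(N_NODES, DIM)) * 0.1

    def step(self, x):
        h = np.tanh(self.W @ x)                  # ask the question
        conf = 1 / (1 + np.exp(-self.out @ h))   # "do I know enough?"
        nxt = int(np.argmax(self.route @ h))     # if not, who's next?
        return h, conf, nxt

nodes = [Node() for _ in range(N_NODES)]

def answer(x):
    i, hops = 0, 0
    while True:
        x, conf, nxt = nodes[i].step(x)
        if conf > 0.5 or hops >= MAX_HOPS:   # confident, or out of hops
            return x, hops                   # hops could feed the loss
        i, hops = nxt, hops + 1

y, hops = answer(rng.normal(size=DIM))
print(f"answered after {hops} follow-up question(s)")
```

The key property is that only the nodes along the chosen path ever run, which is the whole point of penalizing the number of hops in the loss.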

The problem is that I don’t yet understand backpropagation well enough to implement this system. Also, from what I do understand, training will still be incredibly inefficient, since you’re gonna need a lot of memory to load the training data in batches. But it does mean that, theoretically, this algorithm becomes less computationally expensive over time, at least at evaluation, which means the final product should only use a fraction of the computational power that a dense network of the same size would need.

Those of you who are more well-versed in machine learning might notice that this sounds similar to a Recurrent Neural Network (RNN), but that is not the case. An RNN uses the same weight matrix for every input and just uses the previous hidden state to influence the next output. My proposed algorithm has several different weight matrices, but only the ones relevant to the current input-output relationship get used in the calculation. It doesn’t necessarily take in sequential data; rather, it forms a series of queries that each have their own way of reading the data, and it tries to optimize the way those queries form.
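
For contrast, here’s the textbook RNN update (a simplified form, ignoring biases and the output layer); note the single shared pair of weight matrices reused at every step, unlike the per-node query matrices in my proposal:

$$h_t = \tanh(W_h\,h_{t-1} + W_x\,x_t)$$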

This system could probably improve the performance of the massive AI models we have right now, but I don’t think it’s the way to AGI. For that, we need to address the core problem I haven’t mentioned yet: computers are still built on digital architectures. AI models rely heavily on floating-point operations, which are extremely inefficient at the scale they’re being used. Some companies have started investing in analog matrix chips, which have their own problems but are far more efficient and energy-friendly than GPUs for tensor calculations. A lot of research has also gone into quantum computing, which may well be the future of parallel computing in general. But for now, all this investment into fixed-topology MLP networks is just gonna make the AI bubble pop really loud.

Closing Thoughts

This whole post has basically been an excuse for me to rant about my thoughts and feelings on the current state of AI technology. I may do a follow-up, either on my proposed algorithm or another rant focused more on the use of AI in society, but for now, these have been my thoughts. I do wanna start writing more of these posts again, but it’s been so hard to balance school life, office life, and personal life that I rarely find the time. I started writing this post over a week ago, and I also have one in the drafts that’s been sitting there for several months now. Till then, keep thinking!

Addendum

I’m writing this quite a few weeks after publishing, because after working on the newly proposed architecture, I found that transformers (the neural network architecture used in LLMs and other advanced AI models) are in fact Turing-complete, at least under certain theoretical assumptions like unbounded precision or unlimited computation steps. In layman’s terms, this means they are capable of emulating the operations of any computing machine, even themselves. This still does not take away from the overall point of the post, because Turing-completeness is not enough to prove that a system is capable of emulating human thought. Maybe it is capable, but we don’t understand the human brain well enough to really confirm that, and even then, a system being Turing-complete does not mean it will be efficient.

Also, looking back on my proposed algorithm, it still has a lot of the same problems I mentioned before; I just didn’t think it through enough. I’m still going to try out my own stuff for fun, but I think it will be a long, long while until someone truly solves this.

Written Language

Recently, I’ve started learning Mandarin Chinese. Like most people, I came in thinking the complex writing system would be the hardest part, so I was pleasantly surprised to find how intuitive it actually is. Unlike most writing systems, Chinese hanzi don’t reliably tell you how to pronounce them the way an alphabet does; they primarily represent meaning (many characters do contain rough phonetic hints, but that’s a rabbit hole for another time). At first, this seems very difficult and unintuitive: you’d have to look the character up in a dictionary, ask a native speaker, or even use a TTS to know how it sounds, and even after that, you still need to remember the pronunciation, tones and all (which is hard for people unfamiliar with tonal languages), to be able to read it aloud again. However, as I started to use the language more (which, as of me starting this article, has only been a week), I found a beauty in the separation of the written language from the spoken language.

See, one of the first things I learned about hanzi is that they can mostly be read in any language that uses them, whether Mandarin, Cantonese, Taiwanese, Japanese, Korean, etc., since they mainly carry meaning rather than phonetic information (obviously this is a gross oversimplification, but I won’t get into that here). This means the same characters can be pronounced differently in different languages or dialects but still have mostly the same meaning. I didn’t fully appreciate this until I started actually using hanzi to text my Chinese friends (btw, typing in Chinese is wonderful, and I’ll talk more about that later) and found that even if I had forgotten how to pronounce a character, I could still remember what it means and sometimes just read it in English in my head. Of course, I still look up the pinyin so I can actually speak it, but I really appreciated being able to look at a character and know what it means before knowing how to say it. Reading alphabet-based languages like English eventually works the same way, since over time you become familiar with the shape of a word rather than its individual letters, and those languages have the advantage of still carrying phonetic information. But as I learned more about Chinese, I started to appreciate just how much meaning can be packed into each character.

Now, I am going to butcher this explanation, as I am not yet an expert on Chinese characters, but I hope I can still make you appreciate how they work. Basically, hanzi are typically made up of multiple smaller characters squished together into one. For example, 好, meaning “good,” is made up of 女 (woman) and 子 (child). 我 (I or me) is made up of 手 (hand) and 戈 (spear, halberd, etc.). Those simpler characters can sometimes be broken down further, or you can go the other way and form more complex characters with new meanings. Most of the time, there is no fully logical system to how these characters were formed, mainly because the language evolved naturally over several thousand years of use, but it does mean that new characters can be created. I think the best way to show this is with the Chinese periodic table of elements. The video I embedded below explains it best, but I’ll try my best to give a brief summary. Each hanzi has a specific smaller character within it called its “radical,” which helps to identify it (think of it like the first letter of a word, so you can look it up in a dictionary). With the periodic table, each element’s hanzi is given a radical representing its state of matter at room temperature, so gases like oxygen (氧) have the radical 气 to indicate that they are normally gases, while mercury (汞) has the radical 水 (water) to indicate that it’s a liquid at room temperature. The video does a much better job explaining the brilliance of how the rest of each character is formed, but my point is that every single element in the periodic table gets its own unique character that tells you a basic property of that element and can be pronounced with a single syllable!

Before I continue to my next point, I’d like to talk a bit about typing in Chinese, because it is perhaps my favourite thing about the language. There are many different ways to input Chinese characters into a computer due to the complexity of the writing system, but the way I and many other people type is with the pinyin keyboard. For those unaware, pinyin is the official and most common system for romanizing Chinese characters (i.e. turning them into Latin alphabet letters). This means you use the same keyboard you’d use for typing in English to type the pinyin for the words you want, and then a text prediction bar, like on your phone keyboard, shows a list of hanzi that are pronounced that way. Now, at first it seems the same as typing any other language, since you’re just typing the letters as you’d pronounce them, but I found that the text prediction is smart enough that most of the time, you can type a whole sentence with just a few key presses! For example, say I wanted to type 我是快乐 (I am happy). I could type the full pinyin “Wo shi kuaile” (idk how to type the tones); however, I could also hit the keys “W S 1 K L 1” and get the same result (btw, the “1” means I chose the first prediction that showed up). That’s 6 total key presses compared to the 13 if I had typed the whole pinyin. Unfortunately, I still have to use a handwriting input for less common characters, like some people’s names, but in everyday use, the pinyin keyboard is extremely fast at inputting information. I’m normally not that fast at typing in English (even with text prediction), and I’m not fluent enough in Chinese yet to know how much faster I am in it, but I’ve found that when I switch back to texting in English, it feels so much clunkier, and that made me appreciate the language even more.

I’ve been hyping up written Chinese this whole article so far. Does that mean I think it’s the greatest system in the world and that everyone should switch to it so we can all experience its greatness and 中国永远? No, not at all. But learning all this about the language made me really appreciate the diversity of languages around the world and why we should make an effort to study and preserve them as much as we can. It’s especially important to preserve both the written and spoken forms of a language, because it’s the relationship between the two that makes up the whole language. With Chinese, there is a massively interdependent relationship between the spoken and written forms: some things can be conveyed in the written language but not the spoken one, and vice versa. One cannot be fully understood without the other. If all Chinese speakers (and computer TTS programs) suddenly disappeared and we had only the written texts, or conversely, if we lost all records of written Chinese and speakers could only speak the language or write it with a different system, so much of the language would be lost. It would basically be dead. This interdependent relationship between the spoken and written language made me think about how other languages have something similar going on. Maybe not to the same degree, but let me explain.

I think the first thing that got me thinking about all this was capital letters. Grammatically, capital letters signify the beginning of a sentence or a proper noun, and languages that don’t have an uppercase set of letters usually have their own way of signifying these, or don’t need to at all. However, from a cultural perspective, capital letters provide some things that can’t be truly represented in other systems. For example, TYPING IN ALL CAPS LIKE THIS CAN SIGNIFY YELLING, WHICH NORMALLY CAN ONLY BE CONVEYED THROUGH SPOKEN LANGUAGE, BUT WITH ALPHABETS LIKE LATIN OR CYRILLIC, WHICH HAVE UPPERCASE LETTERS, YOU CAN CONVEY YELLING THROUGH TEXT ALONE. Also, sarcasm and mockery are often difficult to convey outside of spoken language, but iF yOu aLtErNaTe BeTwEeN cApiTaL aNd LoWeRcAsE LeTtErS LiKe tHiS, yOu cAn MaKe iT oBviOuS tHaT yOu DoN’t iNtEnD tO bE tAkEn sEriOuSLy. Some people online have started using tone indicators for things like sarcasm, mainly to help neurodivergent people who can’t always pick up on it, but for the most part, I think my point still stands: capital letters are an example of a written-language feature that adds a whole new layer of self-expression to a language.

As I thought about this further, I started to see a lot more examples, even in English, of the interdependence of written and spoken language. The most obvious ones I could think of are jokes and puns. Some puns and jokes only make sense when spoken aloud, some only when written down, and some require the cultural context of both. I can’t name any specific English ones off the top of my head, but there is a joke I will mention later that only makes sense spoken aloud, though that has more to do with my later point about interdependence between completely different languages. Anyway, I digress. Written English also shows how the written form of a word doesn’t always represent the spoken form, a bit like what I mentioned earlier about written Chinese being separate from the spoken form. With English, it’s mainly due to inconsistent spelling rules, but the concept is still there. For example, “there,” “their,” and “they’re” are pronounced exactly the same in spoken English and cannot be differentiated on their own without context, but their written forms can be instantly distinguished at a glance. On the other end, you have words like “live,” which are always written the same way but have different meanings or even pronunciations and are only differentiated by context or by being spoken. For the most part, the disparity between spoken and written English isn’t as big as with Chinese, but you can start to see my point that basically every language only fully makes sense with both parts.

A lot of my points really just boil down to cultural context, because every natural language arises from groups of people communicating with each other. That’s why spoken Mandarin can still be understood even though so many words sound like other words: the words themselves don’t necessarily convey the meaning; their interaction and placement with other words do. And of course, no natural language exists in a vacuum, since its speakers talk with other groups of people who may speak other languages. This means that every language is influenced by other languages as well. And sometimes, the interaction between two or more different languages can provide new cultural context that can’t be conveyed by either language alone.

If you’re reading this, unless you’ve translated it into your own language, you know English, and you probably know that a huge share of English words are either borrowed or evolved from other languages (to be fair, that’s true of most languages, and English does descend directly from Old English and the Germanic family, but its vocabulary is unusually full of borrowings from French, Latin, Norse, and beyond). But yeah, most languages also borrow from other languages. Usually, this happens between neighboring groups of people, like how the Japanese adopted the Chinese writing system into kanji while also adopting some Chinese words, and at the same time some words in Chinese are borrowed from Japanese. Other times, it happens through huge migrations of people to other lands, like how most Filipino languages have a lot of words from Spanish due to the conquistadors settling there, or the English word “boondock” coming from the Tagalog word for mountain (“bundok”) via the Philippine-American War. You can even find cases of a bunch of languages mixing together into a unique one, like Hawaiian Pidgin, which grew out of many different languages from different groups of immigrants working together. I’m running out of examples that I can name off the top of my head, but hopefully you get my point.

Anyway, I mentioned earlier a joke that only makes sense spoken aloud. It’s one my dad used to tell when I was a kid, but it only works if you speak both English and Tagalog. It’s a knock-knock joke that goes something like this: “Knock-knock.” “Who’s there?” “Ako maba.” “Ako maba who?” And it ends there, because the recipient of the joke is supposed to realise they just said “I smell bad” in Tagalog. It doesn’t work on someone who doesn’t speak the language because, for one, they likely wouldn’t be able to repeat the “Ako maba” line, and also, it breaks the usual knock-knock structure where the person telling the joke says the punch line. There must be more examples out there to further drive my point, but I think this is enough to illustrate that these cultural differences are what make the interactions between different groups so beautiful.

I don’t know how to continue without rambling and repeating most of my points, so I’ll end with this thought I had about forcing everyone in the world to have the same language. There are some people out there who genuinely think that we should all speak the same language, or that we should switch to a universal writing system like the International Phonetic Alphabet (IPA). I think what I’ve mentioned throughout this post is enough to illustrate why that’s a terrible idea, but even then, I don’t think these people realise just how much you’d limit humanity as a whole by having only one language. Even with computer programming languages, we embrace a huge variety of them, because even though they are all Turing-complete and can mostly do the same things, there are things some languages express that you just can’t express the same way in others. And since language is our way of communicating with other humans, it affects our way of thinking and therefore allows people of different backgrounds to have different systems of thought. Sure, English is pretty much a universal language at this point, since it’s the language of the internet, which spans the whole globe, but having it be the only one would severely limit the human experience.

And before I end this post, I’d like to rant about the concept of universal translators in science fiction. Most of the time, they’re just used as a plot device so the writers don’t have to come up with new languages or explain why everyone speaks the same language as humans, but I think what bothers me about them is that they often imply that every thought can be directly translated into another language, which simply isn’t true. I get not wanting to come up with new languages, but when a story shows no cultural differences between alien races, it breaks a lot of the immersion for me. I think an example of where this is done brilliantly is in Star Trek, specifically an episode of Star Trek: Enterprise where the crew is having dinner with an alien race (I don’t remember which one lol, it’s been a long time) and the aliens storm off, offended by something, but the crew’s translator can’t figure out what, because what they’re saying can’t be directly translated into English. To be fair, this isn’t really an example of the universal translator trope, since they did have the aliens speaking a different language, but my point is that the writers put in a little extra effort to show that some things cannot be translated perfectly into other languages. I get that a lot of writers want to keep things simple, but surely it wouldn’t hurt to add a few small details showing cultural differences between us and the alien races.

Another fun example I just remembered was in Godzilla: Final Wars, where one of the Xiliens is asked for his name and he simply tells them to call him X, since his name cannot be pronounced by humans. It’s such a small detail, but it adds so much intrigue to this alien race, because it tells us that the way they normally communicate with each other is something humans can’t do. It’s like how bees communicate with dance moves but probably wouldn’t understand us if we did the same moves, since we don’t have the same body parts (unless it’s just the motion of their entire body, but you get my point). My point is that we need to be more aware of our cultural and linguistic differences to really appreciate the human experience. Yeah, I think that’s good enough for now. Okay, back to work.

Warp Technology

Before I begin this discussion, I would like to note that I am not a physicist and any of the science I explain here comes from my own understanding of these topics. This entire blog is meant to be a creative output for myself, so there is little need for properly cited research.


Anyone who grew up with Star Trek will undoubtedly be familiar with the concept of a warp engine. For those unaware, a warp engine bends space around a spaceship to allow it to travel across the universe faster than light. When I was pondering this concept on a drive home, I thought about how a lot of other miraculous technologies ended up becoming widely available to the general public. The very computer I am using to type this, and the one you are using to read it, were built upon many decades of research into technology originally used for massive operations like sending rockets to the moon. Now, obviously, warp engines would be far more complex than any computer, but it seems inevitable that a society that develops them would eventually be able to compact the technology down to something an average person could use, and not just for space travel.

This got me thinking about possible concepts for commercially available warp technology. The obvious ones involve transport, and not just the intergalactic kind. With advanced enough warp technology, you could build a whole city in which people get wherever they need in an instant by warping the space between them and their destination. Train lines that travel across the country could reach their destinations in seconds.

Then I thought about how it could be used to simply compact things. Entire apartment complexes that take up city blocks could be compressed into a pocket dimension that people could simply walk into. Your closet full of junk could become as spacious as the TARDIS from Doctor Who. Space becomes essentially irrelevant when you have a technology that can change it very easily.

Of course, anyone familiar with the concept of relativity would know that bending space inadvertently affects time, the universe’s way of making sure the speed of light stays constant in every frame of reference. Warp technology could take advantage of this effect by creating spaces where time is slowed down or sped up to meet certain needs. For example, one could use it in a server room to speed up the computers’ calculations relative to the outside. It could also be used recreationally, having time move differently at home so that people can spend a lot more time resting while time moves normally in the outside world. There are obviously a lot of negative effects that would result from messing with time like this, but I can imagine that a society that reached this point would have figured out ways around those problems.

One really interesting use I thought about only recently is using warp technology to create extremely accurate color displays. I’ll try not to get too deep into quantum mechanics here, but basically, when electrons in atoms emit light, they only do so at very specific frequencies, because energy levels come in fixed, discrete amounts. Partly because of this, color displays, no matter the technology, do not produce the exact wavelength of every color in the visible spectrum; they get around it with a trick on our eyes, mixing specific amounts of red, green, and blue light, the colors our eyes are most sensitive to. Because of this, most of the colors we see on a display, such as the color yellow, are not the actual wavelengths of those colors. The effect is still more than convincing enough, of course, but with warp technology, it may be possible to accurately produce light of any frequency in the visible spectrum.

Essentially, you make use of the Doppler effect, which is often associated with the sound of something approaching you but is also seen at the farthest edges of the universe, where light gets shifted to longer wavelengths as space expands between galaxies. With warp technology, you could take advantage of this effect: have something produce a fixed wavelength of light, then bend the space around it so that the color we end up seeing is different. You could then use this to make a beam of light that aims at different pixels on a screen, somewhat like an old CRT television, except instead of electrons hitting phosphors on a screen, they’re actual photons of the exact colors we want. With how advanced displays already are, I can’t imagine how much of a difference this would make, but you can imagine a company would market the hell out of it and make people think it’s the future of color displays.
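
For reference, physicists describe the size of this shift with a single number z; in its simple form (glossing over the differences between the Doppler and cosmological versions), the relation is:

$$\lambda_{\text{observed}} = (1 + z)\,\lambda_{\text{emitted}}$$

A positive z stretches the light toward red, and a negative z compresses it toward blue, which is exactly the knob a warp display would be turning.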

Continuing on the concept of bending light, there is a phenomenon in physics known as gravitational lensing, where objects in space have such strong gravitational fields that light bends around them, much like through a lens. With warp technology, it may be possible to build powerful telescopes that simply bend space to create lenses. Since the light would not pass through a physical medium like glass, we’d avoid a lot of the light lost in physical telescopes. For more everyday uses, this could also work for cameras. Instead of having to buy a bunch of different lenses for different purposes, maybe there could be a product that uses small warp engines to form any lens you want through gravitational lensing.


Those are the everyday uses I can think of right now, but now I want to shift the focus to perhaps the more obvious alternative use of warp technology: weaponry. A lot of technologies started out as tools of war, so it should be no surprise that warp technology would first be used as a weapon. Surprisingly, I can only name two sci-fi franchises that cover the concept of weaponized warp technology: Star Wars and Titanfall (there was likely a Star Trek episode that covered it too, but I can’t remember one at the moment). Star Wars technically uses hyperspace technology, which is a bit different from warp technology, but since they serve similar purposes, I’ll treat them as basically the same here. In The Last Jedi, there is an infamous scene where a character uses a ship’s hyperdrive to destroy an entire fleet of ships in an instant. In Titanfall 2, the enemies use a weapon called the Fold Weapon, which bends space-time to destroy an entire planet from a distance. Both of these cases turn space-travel technology into weapons of mass destruction, but I want to cover a bit of the small-scale weapon applications, because there’s only so much you can say about planet-destroying weapons.

The first thing I thought of was something similar to the AR2 from Half-Life 2, which fires pulses of dark energy instead of bullets. Dark energy, in the context of theoretical physics, is basically what causes the space between galaxies to expand rapidly, essentially bending space-time like our warp engines. One could use this to create untraceable weapons, since they wouldn’t fire physical bullets but instead small ruptures in space-time.

Another possible application would be some kind of gravity grenade, as seen in some futuristic shooter games, where instead of a normal explosion, the grenade changes gravity to either pull enemies into it or launch them upwards. With warp technology, this would not only be possible but could also have various other modes, like pushing enemies outward like a normal explosion, or even freezing them in time using the time manipulation I mentioned in the earlier section. This could even be applied to a missile for more destructive purposes, and if used as a nuke, it could avoid a lot of the radioactive side effects of nuclear weapons.

A really interesting application would be using warp technology to curve bullets. If you ever played Angry Birds Space, you may remember that a lot of the levels used gravity zones (or whatever they were called) to change the birds’ trajectories mid-flight. The same idea could be used to fire bullets at enemies behind cover by bending space to curve each bullet’s trajectory around the cover. If any game developers are reading this, please make a shooter mechanic where you can curve bullets with gravity.

I might expand on this topic a bit more sometime, but for now, these are my thoughts.